NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.



2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping; identification of mutations responsible for
genetic disorders or other traits, to assess biodiversity, and to produce many other types of data
and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960. The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic
acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960, or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the polynucleotides
of (a) under stringent hybridization conditions. Biologically or immunologically active variants of
any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof
(e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence
identity) that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g. host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the 
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be 
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the 
identification of subjects exhibiting a predisposition to such conditions. The invention provides 
a method for detecting the polynucleotides of the invention in a sample, comprising contacting 
the sample with a compound that binds to and forms a complex with the polynucleotide of 
interest for a period sufficient to form the complex and under conditions sufficient to form a 
complex and detecting the complex such that if a complex is detected, the polynucleotide of 
interest is detected. The invention also provides a method for detecting the polypeptides of the 
invention in a sample comprising contacting the sample with a compound that binds to and forms 
a complex with the polypeptide under conditions and for a period sufficient to form the complex 
and detecting the formation of the complex such that if a complex is formed, the polypeptide is 
detected. 

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The methods of the invention also include methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.






The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have 
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene 
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications, 
as described herein, including use in arrays for detection. 



4. DETAILED DESCRIPTION OF THE INVENTION 






4.1 DEFINITIONS 



It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
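
By way of illustration only, the base-pairing rule in the preceding definition can be sketched in Python. The function names (`complement`, `reverse_complement`, `fraction_paired`) are hypothetical and not part of the disclosure, and treating N as pairing with N is a simplifying assumption.

```python
# Watson-Crick pairing for DNA bases. Treating N as pairing with N is an
# illustrative simplification, not something stated in the text.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in the 5'->3' direction."""
    return complement(seq)[::-1]


def fraction_paired(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that form Watson-Crick pairs.

    1.0 corresponds to "complete" complementarity; a lower value is "partial".
    Both strands must be non-empty and of equal length (an assumption made
    here to keep the sketch short).
    """
    if not strand_5to3 or len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b for a, b in zip(strand_5to3.upper(), other_3to5.upper())
    )
    return paired / len(strand_5to3)


print(complement("AGT"))              # the 3'->5' partner of 5'-AGT-3'
print(reverse_complement("AGT"))      # the same partner written 5'->3'
print(fraction_paired("AGT", "TCA"))  # complete complementarity
```

Here `complement("AGT")` gives the 3'-TCA-5' partner from the example in the text, while `reverse_complement` gives the same strand in the conventional 5'->3' orientation.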



The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies,
fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
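
As a rough check of the arithmetic in the preceding paragraph, the ratio of possible k-mers to searchable base pairs can be computed directly. This is only a sketch: the function name `kmers_per_position` is hypothetical, the genome size and the 5% expressed fraction are the figures given in the text, and the model ignores base composition and repeats.

```python
# Figures taken from the text; the function and variable names are illustrative.
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05   # text: expressed sequences are less than about 5%


def kmers_per_position(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers (4**k) to the number of base pairs searched.

    A ratio of R corresponds to a chance of roughly 1 in R that a given
    k-mer is fully matched somewhere in the target by chance alone.
    """
    return 4 ** k / target_bp


genome_20 = kmers_per_position(20, GENOME_BP)
genome_17 = kmers_per_position(17, GENOME_BP)
expressed_15 = kmers_per_position(15, GENOME_BP * EXPRESSED_FRACTION)

print(f"20-mer, whole genome: about 1 in {genome_20:.0f}")
print(f"17-mer, whole genome: about 1 in {genome_17:.1f}")
print(f"15-mer, expressed sequences: about 1 in {expressed_15:.1f}")
```

Under these assumptions the ratios come out near 367, 5.7 and 7.2, which is consistent with the rounded figures of 1 in 300 and approximately 1 in 5 used in the text.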

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
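
The single-mismatch calculation described above can be sketched the same way. The function names are hypothetical, and scaling the per-site probability by the number of base pairs searched (three billion for the genome, 5% of that for expressed sequences) is an assumption made here to compare against the approximate figures in the text.

```python
def single_mismatch_probability(k: int) -> float:
    """Chance that a random k-base site equals a given k-mer with one substitution.

    Follows the text: the full-match probability (1/4**k) multiplied by the
    number of single-base variants (3 alternative bases at each of k positions).
    """
    return (3 * k) / 4 ** k


def expected_hits(k: int, target_bp: float) -> float:
    """Expected number of chance single-mismatch sites across target_bp positions.

    Multiplying by target_bp is an illustrative assumption; it treats every
    position in the target as an independent random site.
    """
    return single_mismatch_probability(k) * target_bp


GENOME_BP = 3_000_000_000        # from the text
EXPRESSED_BP = GENOME_BP * 0.05  # text: expressed sequences are under about 5%

print(single_mismatch_probability(25))  # (1/4**25) * (3 * 25)
print(expected_hits(20, GENOME_BP))     # twenty-mer against the genome
print(expected_hits(18, EXPRESSED_BP))  # eighteen-mer against expressed sequences
```

With these assumptions the twenty-mer and eighteen-mer cases come out between roughly one in six and one in nine, the same order of magnitude as the approximate one-in-five figures quoted.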

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
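
For illustration, the definition above (a run of triplets with no termination codon) can be expressed as a short scan over the three forward reading frames. This is a sketch only: the function names are hypothetical, the stop codons are those of the standard genetic code, and, following the definition as worded, no start codon is required.

```python
# Termination codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq: str, frame: int = 0):
    """Yield (start, end) indices of maximal stop-free codon runs in one frame.

    Indices are 0-based and half-open, so seq[start:end] is the run itself.
    """
    seq = seq.upper()
    run_start = frame
    last = frame  # end index of the last complete codon seen so far
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if i > run_start:
                yield run_start, i
            run_start = i + 3
        last = i + 3
    if last > run_start:
        yield run_start, last


def longest_orf(seq: str) -> str:
    """Return the longest stop-free run over the three forward frames ("" if none)."""
    best = ""
    for frame in range(3):
        for start, end in stop_free_runs(seq, frame):
            if end - start > len(best):
                best = seq[start:end].upper()
    return best


example = "ATGAAATAGCCC"
print(list(stop_free_runs(example, 0)))  # runs on either side of the TAG codon
print(longest_orf(example))
```

A fuller implementation would normally also scan the reverse complement and might require an initiating ATG; both are omitted to keep the sketch close to the wording of the definition.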

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
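
The four residue groups listed above lend themselves to a simple lookup. The sketch below labels a substitution "conservative" when both residues fall in the same listed group; that rule, the one-letter codes, and the function names are illustrative assumptions, and the comparison assumes two sequences of equal length with no insertions or deletions.

```python
# Residue groups as listed in the text, written in one-letter codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the group containing a one-letter amino acid code."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unrecognized amino acid code: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same group (illustrative rule)."""
    return residue_class(original) == residue_class(replacement)


def classify_substitutions(reference: str, variant: str):
    """List (position, original, replacement, conservative) for each difference.

    Positions are 1-based. Sequences must have equal length; insertions and
    deletions are outside the scope of this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (pos, a, b, is_conservative(a, b))
        for pos, (a, b) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if a != b
    ]


print(classify_substitutions("MKTL", "MRTD"))
```

In this example the lysine-to-arginine change at position 2 is flagged as conservative and the leucine-to-aspartic-acid change at position 4 is not. Whether a given variant actually retains activity would still need to be determined experimentally, as the text notes.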

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from
heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent 
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and 
60°C (for 23-base oligonucleotides). 
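
The four exemplary wash temperatures for deoxyoligonucleotides given above can be kept in a small lookup table. The sketch below is illustrative only: the names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and no temperature is interpolated for lengths the text does not list.

```python
# Exemplary stringent wash temperatures (6X SSC/0.05% sodium pyrophosphate),
# keyed by oligonucleotide length in bases, exactly as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the exemplary wash temperature in degrees Celsius for a listed length."""
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no exemplary wash temperature is listed for {oligo_length}-base oligonucleotides"
        ) from None
```

For a 20-base oligonucleotide, `wash_temperature_c(20)` returns 55. Lengths outside the table raise an error instead of guessing a value.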

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by 
no more than 20% (80% sequence identity) and in a further variation of this embodiment, by no
more than 10% (90% sequence identity) and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity and most preferably at least 98% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, more preferably at least about 95% identity, more
preferably at least 98% and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially 
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g. by varying
hybridization conditions.
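
The variation ratio defined above (substitutions, additions, and deletions divided by the number of residues in the substantially equivalent sequence) can be computed once two sequences have been aligned. The sketch below assumes the alignment already exists, with gaps written as "-". It does not implement the Jotun Hein method, and the function names are hypothetical.

```python
def variation_fraction(reference_aln: str, subject_aln: str) -> float:
    """Differences between two pre-aligned sequences, divided by the number
    of residues in the subject (the candidate equivalent sequence)."""
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    # A differing column is a substitution, an addition, or a deletion.
    differences = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_residues = sum(1 for s in subject_aln if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def percent_identity(reference_aln: str, subject_aln: str) -> float:
    """Express the variation fraction as percent sequence identity."""
    return 100.0 * (1.0 - variation_fraction(reference_aln, subject_aln))


def is_substantially_equivalent(reference_aln: str, subject_aln: str,
                                max_variation: float = 0.35) -> bool:
    """Apply the 'about 0.35 or less' sequence criterion from the text."""
    return variation_fraction(reference_aln, subject_aln) <= max_variation
```

One substitution in a 10-residue subject gives a variation fraction of 0.1, or 90% identity. The functional criteria in the definition (no adverse functional dissimilarity) are not captured by this arithmetic.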

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the
context dictates otherwise.

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a



10/30/2006, EAST Version: 2.0.3.0 



WO 01/57190 PCT/US01/04098 
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a 
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a 
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the 
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like 
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or 
combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 
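
Because the same SEQ ID NO ranges recur throughout this section, a small helper can make explicit which identifiers denote nucleotide sequences and which denote polypeptide sequences. The ranges are copied from the text above; the names `NUCLEOTIDE_RANGES`, `POLYPEPTIDE_RANGES`, and `seq_id_kind` are hypothetical and serve only as an illustration.

```python
# Inclusive SEQ ID NO ranges as recited in the text.
NUCLEOTIDE_RANGES = [(1, 984), (1969, 2952), (3937, 3942), (3949, 3954)]
POLYPEPTIDE_RANGES = [(985, 1968), (2953, 3936), (3943, 3948), (3955, 3960)]


def _in_ranges(number: int, ranges) -> bool:
    """True when the number falls inside any of the inclusive ranges."""
    return any(low <= number <= high for low, high in ranges)


def seq_id_kind(seq_id_no: int) -> str:
    """Classify a SEQ ID NO as 'nucleotide' or 'polypeptide'."""
    if _in_ranges(seq_id_no, NUCLEOTIDE_RANGES):
        return "nucleotide"
    if _in_ranges(seq_id_no, POLYPEPTIDE_RANGES):
        return "polypeptide"
    raise ValueError(f"SEQ ID NO: {seq_id_no} is outside the recited ranges")
```

For example, `seq_id_kind(984)` returns `"nucleotide"` and `seq_id_kind(985)` returns `"polypeptide"`.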

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954
can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA
libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more
typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a polynucleotide recited
above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other
polynucleotide sequences in the same family of genes or can differentiate human genes from
genes of other species, and are preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly
contemplated.
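
Codon variability of the kind described above can be checked by translating two coding sequences and comparing the resulting proteins. The sketch below assumes the standard genetic code and DNA written with the letters A, C, G, and T. The names `CODON_TABLE`, `translate`, and `encodes_same_protein` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when two ORFs differ, at most, by synonymous codon substitutions."""
    return translate(orf_a) == translate(orf_b)
```

Here `ATGGCTTAA` and `ATGGCCTAA` both translate to `MA*`, since GCT and GCC each encode alanine, so `encodes_same_protein` returns true for that pair.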

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 



The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
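
The structure of a mutagenic oligonucleotide described above (the altered codon flanked on both sides by template sequence) can be sketched as follows. This is an illustration under stated assumptions and not a laboratory protocol: the function name `mutagenic_oligo` is hypothetical, and the default of 12 flanking nucleotides is an arbitrary placeholder, since the text requires only "sufficient" flanking sequence to form a stable duplex.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Build an oligonucleotide that replaces one codon of the template.

    template    -- coding-strand DNA, in frame from its first base
    codon_index -- zero-based index of the codon to change
    new_codon   -- three-base replacement codon
    flank       -- template bases kept on each side (assumed value)
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    start = codon_index * 3
    end = start + 3
    if codon_index < 0 or end > len(template):
        raise ValueError("codon_index is outside the template")
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("template is too short to supply both flanks")
    upstream = template[start - flank:start]
    downstream = template[end:end + flank]
    return upstream + new_codon.upper() + downstream
```

With the template `ATG` + four `GCT` codons + `AAA` + four `GCT` codons + `TAA`, changing codon 5 from `AAA` (lysine) to `GAA` (glutamic acid) yields a 27-base oligonucleotide with 12 matching bases on each side of the change.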
A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954, or functional equivalents thereof, may be used to generate recombinant
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954
or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill
in the art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.
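
Complementarity in the Watson-Crick sense can be illustrated by computing the reverse complement of a sense-strand sequence. The sketch below is a simplified illustration: the function name `antisense_dna` is hypothetical, only unmodified A, C, G, T, and U bases are handled, and the result is written as DNA.

```python
# Watson-Crick pairing; U in an mRNA input pairs with A, as T does.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")
_ALLOWED_BASES = set("ACGTU")


def antisense_dna(sense: str) -> str:
    """Return the DNA reverse complement of a sense-strand DNA or mRNA sequence."""
    sense = sense.upper()
    unexpected = set(sense) - _ALLOWED_BASES
    if unexpected:
        raise ValueError(f"unexpected base(s): {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return sense.translate(_COMPLEMENT)[::-1]
```

For the sense fragment `ATGGCC`, `antisense_dna` returns `GGCCAT`. The same result is obtained for the corresponding mRNA fragment `AUGGCC`.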

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the molecules or to
increase the physical stability of the duplex formed between the antisense and sense nucleic
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v),
wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
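
By way of illustration only, the Watson and Crick design rule described above can be sketched as a short Python routine that takes a window of a coding strand and returns the antisense oligonucleotide as its reverse complement. The coding strand shown is a made-up example and is not one of the disclosed SEQ ID NOs; the function names and the choice of a 15-nucleotide window spanning the start codon are assumptions made for the sketch. Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Illustrative sketch only: sequence and helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Return an antisense DNA oligo (5'->3') against a window of the coding strand.

    start is a 0-based index into coding_strand; length is the oligo length.
    """
    target = coding_strand[start:start + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


# Made-up coding strand with an ATG start codon at index 6.
coding = "GCCACCATGGCTTTGAGCTGGTAA"

# A 15-nucleotide oligo whose target window spans the translation start site.
oligo = antisense_oligo(coding, start=1, length=15)
print(oligo)
```

The oligo length is a parameter so that any of the lengths recited above (about 5 to about 50 nucleotides) can be tried against the same target window.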

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) BioEssays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one
or more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO:
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., with at
least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%,
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least about
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
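
By way of illustration only, the percent sequence identity thresholds recited above for "substantial equivalents" can be checked with a short Python sketch. The sketch assumes that the two amino acid sequences have already been aligned (gaps written as "-") and assumes one particular definition of identity, namely identical residues divided by aligned columns; the alignment method and the exact denominator are not specified in this passage, and the sequences and function names shown are hypothetical.

```python
# Illustrative sketch only: the identity definition, names and sequences are assumptions.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95)  # tiers taken from the percentages recited above


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping columns where both are gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the highest recited threshold that the identity reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Two made-up, pre-aligned 10-residue sequences differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity, highest_threshold_met(identity))
```

Sequence identity alone does not establish that a variant retains biological activity; under the text above, activity is a separate requirement.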

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.
Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
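
By way of illustration only, the definition of a "degenerate variant" given above can be expressed as a short Python check: two ORFs that differ in nucleotide sequence but translate to the same polypeptide. The sketch assumes the standard genetic code and in-frame ORFs made up for the example; the function names are hypothetical and the sequences are not taken from the sequence listing.

```python
# Illustrative sketch only: standard genetic code assumed; names and ORFs are hypothetical.
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); "*" is stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF into a one-letter amino acid string."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in nucleotide sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# GCT/GCA both encode alanine and TTG/CTC both encode leucine.
print(is_degenerate_variant("ATGGCTTTG", "ATGGCACTC"))
```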

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
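
By way of illustration only, the enumeration step of the alanine-scanning method described above can be sketched in Python: each single residue, or each string of consecutive residues, is replaced by alanine to give the list of variants to be made and tested. The peptide shown is a made-up example, the function name and `window` parameter are hypothetical, and the biological activity assay itself is outside the sketch.

```python
# Illustrative sketch only: names and the example peptide are hypothetical.
def alanine_scan(protein: str, window: int = 1):
    """List (1-based start position, variant sequence) for each alanine substitution.

    window=1 substitutes single residues; window>1 substitutes strings of residues.
    Windows that already consist only of alanine are skipped.
    """
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue  # substitution would leave the sequence unchanged
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


# Single-residue scan of a made-up tetrapeptide.
for position, variant in alanine_scan("MKAC"):
    print(position, variant)
```

A loss of activity in a given variant would point to the substituted position as important for function, which is the inference the text above draws from this type of analysis.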

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.,
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
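
For orientation only, the notion of percent identity over a best-match global alignment can be sketched as follows. This is a simplified Needleman-Wunsch alignment with a linear gap penalty and arbitrary illustrative scores (match +1, mismatch -1, gap -2); it is not the GAP or BLAST implementation, and the function names and scoring values are assumptions made for the example. Production programs use substitution matrices and affine gap penalties, so their results will generally differ.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    x, y = global_align(a, b)
    if not x:
        return 0.0  # two empty sequences: identity is undefined, report 0
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


print(global_align("ACGT", "ACT"))
print(percent_identity("ACGT", "ACT"))
```

Note that the denominator chosen here is the full alignment length including gap columns; other conventions (for example, the length of the shorter sequence) are also in use, so the convention should be stated whenever identity values are compared.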

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention comprise one or more domains fused to 
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin 
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the 
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative 
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as 
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to 
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
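
As a minimal sketch of what "in-frame" means in practice, the following Python example joins two coding sequences and verifies that the reading frame is preserved. The function names and the short toy sequences are hypothetical and chosen only for illustration; they do not correspond to GST or to any sequence of the invention. The sketch assumes both inputs are coding strands that begin at codon position one.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a coding strand codon by codon ('*' marks a stop codon)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_partner, c_partner, linker=""):
    """Join two coding sequences so the C-terminal partner stays in frame.

    The stop codon of the N-terminal partner, if present, is removed so
    that translation reads through into the second partner.
    """
    n_partner, c_partner, linker = (s.upper() for s in (n_partner, c_partner, linker))
    if n_partner[-3:] in STOP_CODONS:
        n_partner = n_partner[:-3]
    for name, part in (("N-terminal partner", n_partner),
                       ("linker", linker),
                       ("C-terminal partner", c_partner)):
        if len(part) % 3:
            raise ValueError(f"{name} length is not a multiple of 3; "
                             "the fusion would shift the reading frame")
    fusion = n_partner + linker + c_partner
    protein = translate(fusion)
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon inside the fusion")
    return fusion, protein


# Toy sequences for illustration only.
fusion, protein = fuse_in_frame("ATGTCCCCTTAA", "GGATCCAAATGA")
print(fusion, protein)
```

A check of this kind is useful when designing the junction between a vector-encoded fusion moiety and an inserted coding sequence, since a single-base offset at the junction abolishes expression of the downstream partner.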

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
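
Purely as an illustration of the sequence relationship underlying antisense molecules, the short Python sketch below derives a DNA oligonucleotide that is the reverse complement of a chosen sense-strand region. The function name and the six-base example are hypothetical; real antisense design also weighs target accessibility, length, chemistry and specificity, none of which is modeled here.

```python
# Map each sense base (DNA or RNA) to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(sense_region):
    """Return the DNA antisense oligo (5'->3') for a sense-strand region."""
    seq = sense_region.upper()
    unexpected = set(seq) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


# Toy example: a six-base stretch of mRNA spanning a start codon.
print(antisense_oligo("AUGGCC"))
```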

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO 93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO 91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.

WO 01/57190 PCT/US01/04098

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed, Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 

for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 

genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific 

promoter driving a selectable marker. The selectable marker allows only cells of the desired type 
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be 
accomplished by culturing the stem cells in the presence of a differentiation factor such as 
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell 
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder 
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in 
the presence of the polypeptide of the invention alone or in combination with other growth 
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell 
proliferation is determined by colony formation on semi-solid support e.g. as described by 
Bernstein et al., Blood, 77: 2316-2321 (1991). 



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal 

biological activity in support of colony forming cells or of factor-dependent cell lines indicates 
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, 
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy 
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the 

growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., 
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or 
treat consequent myelo-suppression; in supporting the growth and proliferation of 
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of 
various platelet disorders such as thrombocytopenia, and generally for use in place of or 
complementary to platelet transfusions; and/or in supporting the growth and proliferation of 
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned 
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as 
those usually treated with transplantation, including, without limitation, aplastic anemia and 
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment 
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow 
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) 
as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others, 
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., 
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells 
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic 
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et 
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, 
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, 
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of 
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture 
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. 
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the 

present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed, has application 
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in 
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing 
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as 
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing 
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by 
a composition of the present invention contributes to the repair of congenital, trauma induced, or 
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for 
attachment or repair of tendons or ligaments. The compositions of the present invention may 
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or 
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect 
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, 
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral 
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which 
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 
composition may be used in the treatment of diseases of the peripheral nervous system, such as 
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous 
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in 
accordance with the present invention include mechanical and traumatic disorders, such as spinal 
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue 
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book 
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 
71:382-84 (1978). 



4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 
A polypeptide of the present invention may also exhibit immune stimulating or immune 

suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and 

proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells 
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., 
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More 
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be 
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, 
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host 
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof, 
including antibodies) of the present invention may also be useful in the treatment of allergic 
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect 
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, 
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, 
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal 
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma 
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune 
suppression is desired (including, for example, organ transplantation), may also be treatable 
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the 
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal 
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization 
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., 
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an 
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 

limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 

rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et 
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105 
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune 
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 

infection, including systemic viral diseases such as influenza, the common cold, and encephalitis. 
Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In 
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with 
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an 
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain 
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II 
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction 
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T 
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding 
an antisense construct which blocks expression of an MHC class II associated protein, such as 
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity 
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce 
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the 

following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. 
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. 
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., 
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. 
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described 
in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery 
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et 
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of 
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., 
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., 
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991. 

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the 
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, 
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive 
based on the ability of inhibins to decrease fertility in female mammals and decrease 
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as 
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH 
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A 
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986. 



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 

activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 

modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 

stimulate, directly or indirectly, the directed orientation or movement of such cell population. 
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates 
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994. 

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders (including 
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events 
in treating wounds resulting from trauma, surgery or other causes. A composition of the 
invention may also be useful for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of 
cardiac and central nervous system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 

Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 


4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), 
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), 
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g., 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 
In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g., from American Type Tissue Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses. Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may themselves be useful as inhibitors of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 
By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 
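
As a rough illustration of the "diminution in complex formation" readout described above, the following sketch computes the percent inhibition of binding from a labeled-complex signal measured with and without a test agent. The function name, the signal values and the 50% hit threshold are hypothetical choices made for the example; they are not taken from this specification. 

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent reduction in specific complex formation caused by a test agent."""
    specific_control = signal_without_agent - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_agent - background
    return 100.0 * (1.0 - specific_test / specific_control)


# Hypothetical readings (arbitrary units) for three test agents.
readings = {"agent_A": 1200.0, "agent_B": 5200.0, "agent_C": 9800.0}
control, background = 10200.0, 200.0
HIT_THRESHOLD = 50.0  # assumed cut-off, not specified in the text

# Keep only agents that reduce complex formation by at least the threshold.
hits = {}
for name, signal in readings.items():
    inhibition = percent_inhibition(signal, control, background)
    if inhibition >= HIT_THRESHOLD:
        hits[name] = round(inhibition, 1)

print(hits)
```

In practice the threshold and the background correction would be set by the particular binding assay chosen. 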

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol., 
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 


4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from 
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for 
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 


4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female 
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); effecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 
The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 
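
The restriction fragment length polymorphism analysis mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the two sequences are invented for the example, the helper name `digest` is hypothetical, and the cut is placed at the start of the recognition site rather than at the enzyme's true cleavage offset. The recognition site GAATTC (EcoRI) is used only as a familiar example. 

```python
def digest(sequence, site):
    """Return fragment lengths after cutting a linear sequence at each site.

    Simplification: the cut is made at the first base of the recognition
    site; a real enzyme cuts at an enzyme-specific position.
    """
    lengths, start = [], 0
    pos = sequence.find(site)
    while pos != -1:
        if pos > start:
            lengths.append(pos - start)
        start = pos
        pos = sequence.find(site, pos + 1)
    lengths.append(len(sequence) - start)
    return lengths


SITE = "GAATTC"  # EcoRI recognition sequence

# Invented 24-base example: the variant carries a single A-to-G change
# inside the recognition site, which abolishes the cut.
reference = "ATGCCGTA" + "GAATTC" + "GGCTAGCTTA"
variant = "ATGCCGTA" + "GGATTC" + "GGCTAGCTTA"

reference_fragments = digest(reference, SITE)
variant_fragments = digest(variant, SITE)
print(reference_fragments, variant_fragments)
```

Differing fragment-length patterns between the two alleles are what the gel readout of such an analysis would distinguish. 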

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
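
The dosing and scoring schedule described above can be laid out as a short sketch. It assumes that the day of adjuvant injection is counted as day 0 and that "every other day" means days 2, 4, and so on through day 24; the function names and the 0.25 kg body weight are hypothetical values chosen for illustration. 

```python
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # scoring days listed in the text


def treatment_days(last_day=24, interval=2):
    """Day 0 (immediately after induction), then every `interval` days."""
    return list(range(0, last_day + 1, interval))


def dose_mg(body_weight_kg, dose_mg_per_kg):
    """Amount of polypeptide per administration for a dose of about 1-5 mg/kg."""
    if not 1.0 <= dose_mg_per_kg <= 5.0:
        raise ValueError("dose outside the 1-5 mg/kg range given in the text")
    return body_weight_kg * dose_mg_per_kg


schedule = treatment_days()
# Scoring days on which, under the day-0 assumption, no treatment is given.
scored_without_treatment = [d for d in SCORING_DAYS if d not in schedule]

print(schedule)
print(scored_without_treatment)
print(dose_mg(0.25, 2.0))  # hypothetical 0.25 kg rat dosed at 2 mg/kg
```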



4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of the human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
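
As a purely arithmetical illustration of the per-dose ranges stated above, the following sketch
converts a range given per kg of body weight into an absolute amount for one patient. The
function and constant names are hypothetical, and the sketch is not a dosing recommendation; as
stated above, the dosage is determined by the prescribing physician.

```python
# Per-dose ranges quoted in the text, expressed in micrograms per kg of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100_000.0)     # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10_000.0)   # about 0.1 ug/kg to 10 mg/kg


def dose_bounds_ug(body_weight_kg, range_ug_per_kg=PREFERRED_RANGE_UG_PER_KG):
    """Return the (low, high) amount of polypeptide per dose, in micrograms."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return low * body_weight_kg, high * body_weight_kg
```

For a 70 kg patient, the preferred range corresponds to roughly 7 µg to 700 mg per dose.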
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or into fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
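
The composition of the diluted VPD:5W system follows from the figures given above by simple
arithmetic. The sketch below assumes that the 1:1 dilution means equal volumes and that the
volumes are additive on mixing (in practice, ethanol-water mixtures contract slightly); the names
used are hypothetical.

```python
# Stock compositions from the text, in % w/v (grams per 100 mL of solution).
VPD_W_V_PERCENT = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_W_V_PERCENT = {"dextrose": 5.0}


def mix_equal_volumes(solution_a, solution_b):
    """Return % w/v of each solute after mixing equal volumes of two solutions.

    Assumes additive volumes, so every concentration is simply halved.
    """
    solutes = set(solution_a) | set(solution_b)
    return {
        name: (solution_a.get(name, 0.0) + solution_b.get(name, 0.0)) / 2
        for name in solutes
    }


VPD_5W = mix_equal_volumes(VPD_W_V_PERCENT, D5W_W_V_PERCENT)
```

Under these assumptions, VPD:5W contains about 1.5% w/v benzyl alcohol, 4% w/v polysorbate 80,
32.5% w/v polyethylene glycol 300 and 2.5% w/v dextrose.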

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.



The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.
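
One common way to read an IC50 off cell culture data is to interpolate between the two tested
concentrations that bracket half-maximal inhibition. The sketch below is an illustration only,
not a method taken from this disclosure: it assumes that inhibition is expressed as a percentage
of the maximal inhibition, that concentrations are listed in ascending order, and that
interpolation is done on a log-concentration scale. The function name is hypothetical.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate the IC50 by log-linear interpolation around 50% inhibition.

    concentrations: tested concentrations, ascending and positive.
    percent_inhibition: inhibition at each concentration, as % of maximum.
    """
    points = list(zip(concentrations, percent_inhibition))
    for (c_low, i_low), (c_high, i_high) in zip(points, points[1:]):
        # Find the first pair of neighbouring points that brackets 50%.
        if i_low <= 50.0 <= i_high and i_high > i_low:
            fraction = (50.0 - i_low) / (i_high - i_low)
            log_low, log_high = math.log10(c_low), math.log10(c_high)
            return 10 ** (log_low + fraction * (log_high - log_low))
    raise ValueError("50% inhibition is not bracketed by the data")
```

A fitted dose-response curve would normally be preferred where enough data points are
available; interpolation is shown only because it makes the definition of the IC50 concrete.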

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
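
The therapeutic index described above is a simple ratio, and a short sketch may make the arithmetic concrete. The function name and the numerical values below are hypothetical and chosen only for illustration; they are not data from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
example_ti = therapeutic_index(ld50=200.0, ed50=4.0)
print(example_ti)  # 50.0
```

A larger ratio indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.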

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably 30-90% and most preferably 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
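
As a rough illustration of the "time above MEC" criterion, the following sketch estimates the fraction of a dosing interval during which the plasma level exceeds the MEC. It assumes a single peak concentration at the start of the interval followed by first-order (exponential) elimination, which is a simplification not stated in the text; the function name and all parameter values are hypothetical.

```python
import math


def fraction_of_interval_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction of one dosing interval with plasma level above the MEC.

    Assumes C(t) = c0 * exp(-k * t) with k = ln(2) / half_life_h,
    i.e., a peak at t = 0 followed by first-order decay (an assumption).
    """
    if min(c0, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all inputs must be positive")
    if c0 <= mec:
        return 0.0  # the level never exceeds the MEC
    k = math.log(2) / half_life_h
    time_above_h = math.log(c0 / mec) / k  # time at which C(t) falls to the MEC
    return min(time_above_h, interval_h) / interval_h


# Hypothetical example: peak 10 units, MEC 2.5 units, 4 h half-life, 12 h interval.
fraction = fraction_of_interval_above_mec(10.0, 2.5, 4.0, 12.0)
print(f"{fraction:.0%} of the interval is above the MEC")
```

Under these assumed values the level stays above the MEC for two half-lives (8 h of a 12 h interval), which would fall within the most preferred 50-90% range.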

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.
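
The per-kilogram ranges stated above translate into absolute daily amounts by multiplying by body weight. The sketch below only performs that unit arithmetic; the function name and the 70 kg body weight are hypothetical, and the output is not a dosing recommendation.

```python
# Ranges from the text, converted to mg/kg (1 µg = 0.001 mg).
EXEMPLARY_RANGE_MG_PER_KG = (0.01e-3, 100.0)  # about 0.01 µg/kg to 100 mg/kg
PREFERRED_RANGE_MG_PER_KG = (0.1e-3, 25.0)    # about 0.1 µg/kg to 25 mg/kg


def daily_amount_range_mg(weight_kg, range_mg_per_kg):
    """Scale a (low, high) mg/kg range to absolute mg per day for one body weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    low, high = range_mg_per_kg
    return (low * weight_kg, high * weight_kg)


# Hypothetical 70 kg subject, for illustration only.
low_mg, high_mg = daily_amount_range_mg(70.0, PREFERRED_RANGE_MG_PER_KG)
print(f"preferred range: {low_mg * 1000:.1f} µg to {high_mg:.0f} mg per day")
```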

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,


WO 01/57190 PCT/US01/04098 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
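
A hydropathy plot of the kind mentioned above can be sketched as a sliding-window average over the Kyte-Doolittle hydropathy scale, in which more negative averages indicate more hydrophilic stretches. The function names, the window length of 7 and the toy sequence are assumptions made only for this illustration; no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Toy sequence (not a sequence of the invention): a lysine-rich stretch
# flanked by isoleucine-rich stretches.
start, peptide, score = most_hydrophilic_window("IIIIIIIKKKKKKKIIIIIII")
print(start, peptide, round(score, 2))
```

Windows with the lowest averages are candidate surface-exposed regions; whether a given window actually forms an accessible epitope would still need to be confirmed experimentally.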

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.



5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
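
For orientation, the classical Scatchard linearization for a single class of binding sites plots bound/free against bound, giving a straight line with slope -1/Kd and x-intercept Bmax. The sketch below fits that line by ordinary least squares. It illustrates the general technique only and is not asserted to be the specific procedure of the cited reference; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit bound/free = (Bmax - bound) / Kd by least squares.

    Returns (Kd, Bmax). Assumes a single class of non-interacting sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                   # equals -1/Kd
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope           # x-intercept of the fitted line
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2 and Bmax = 10 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd, bmax = scatchard_fit(bound_conc, free_conc)
print(round(kd, 6), round(bmax, 6))
```

A smaller Kd (equivalently, a larger association constant Ka = 1/Kd) corresponds to higher binding affinity. With real, noisy data a direct nonlinear fit of the binding isotherm is often preferred over the linearized form.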

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker,
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
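
The figure of ten possible molecules follows from simple counting, which the sketch below reproduces. It assumes that each antibody consists of two heavy/light half-molecules, that either heavy chain can pair with either light chain, and that the two halves of a molecule are interchangeable; the chain labels H1, H2, L1 and L2 are hypothetical names for the two parental heavy and light chains.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Every heavy chain may pair with every light chain: 4 kinds of half-molecule.
half_molecules = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of half-molecules (repeats allowed).
antibodies = list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(antibody):
    """True only when each heavy chain is paired with its cognate light chain."""
    return set(antibody) == {("H1", "L1"), ("H2", "L2")}


correct = [ab for ab in antibodies if is_correct_bispecific(ab)]
print(len(antibodies))  # 10
print(len(correct))     # 1
```

The remaining nine combinations are parental, mispaired or homodimeric species, which is why the purification step is needed to recover the single correct molecule.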

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithio) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis(p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of a radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
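
As a minimal sketch of the two storage formats mentioned above (an ASCII text file and a database), the following Python example records a sequence in a FASTA-style text file and in an SQLite table. The FASTA layout, the SQLite database, the table name `sequences` and all function names are assumptions chosen for illustration; the specification itself names only ASCII files and database applications such as DB2, Sybase and Oracle.

```python
import sqlite3


def write_fasta(path, records, width=60):
    """Write {identifier: sequence} pairs to a plain ASCII FASTA-style file."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap the residues at a fixed width for readability.
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read a FASTA-style file back into an {identifier: sequence} dict."""
    records, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


def store_in_database(conn, records):
    """Record sequences in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", list(records.items())
    )
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

The choice between the two would generally follow the means chosen to access the information: a flat file suits sequential reading, while a table suits lookup by identifier.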

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof, or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
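
To illustrate what identifying ORFs within a nucleic acid sequence involves, the sketch below scans all six reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a simple frame scanner written for illustration, not BLAST or BLAZE; the function names and the `min_codons` threshold are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Find ATG...stop ORFs in all six frames.

    Returns (strand, start, end, orf_sequence) tuples. Coordinates are
    0-based, end-exclusive, and relative to the strand that was scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

A candidate ORF found this way would still need to be compared against known sequences before it is treated as protein encoding.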

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are publicly disclosed, and a variety of commercially
available software packages for conducting search means are, and can be, used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NPOLYPEPTIDEIA). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
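
The relationship between target length and chance matches can be made concrete with a short sketch. The example below pairs a plain exact-match search (a stand-in for the search means, not an implementation of Smith-Waterman or BLAST) with an estimate of how many times a target would be expected to occur by chance. The estimate assumes independent, equally frequent residues, which real sequences only approximate; all names are illustrative.

```python
def search(target, database):
    """Return (seq_id, offset) for every exact occurrence of target.

    `database` maps identifiers to sequences; overlapping hits are reported.
    """
    target = target.upper()
    hits = []
    for seq_id, seq in database.items():
        seq = seq.upper()
        pos = seq.find(target)
        while pos != -1:
            hits.append((seq_id, pos))
            pos = seq.find(target, pos + 1)
    return hits


def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected chance occurrences of one fixed target on one strand.

    Assumes every residue is independent and equally likely, so each of the
    (db_len - target_len + 1) start positions matches with probability
    alphabet_size ** -target_len.
    """
    positions = max(db_len - target_len + 1, 0)
    return positions / alphabet_size ** target_len
```

Under these assumptions a six-nucleotide target is expected about 244 times by chance in a million bases, whereas a 30-nucleotide target is expected fewer than once in 10^11 such databases, which is why longer targets are less likely to be present as random occurrences.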

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
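
As one hedged illustration of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures by finding inverted repeats: a stem, a short loop, and the reverse complement of the stem. This is a sequence-level proxy only; it does not model folding energetics, and the stem and loop sizes are arbitrary assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, stem_length, loop_length) for each inverted repeat.

    A hit means seq[start:start+stem] can base-pair with the stretch that
    follows a loop of `loop_length` residues.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        wanted = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate right-hand stem
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == wanted:
                hits.append((i, stem, loop))
    return hits
```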

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
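
The way sequence information feeds into antisense design can be sketched as follows: an antisense DNA oligonucleotide is the reverse complement of the chosen mRNA region. The example enforces the 20 to 40 base length stated above; the function names and the choice of a DNA (rather than RNA) oligonucleotide are assumptions, and no claim is made about which region of a given mRNA is an effective target.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(mrna, start, length=20):
    """Design a DNA oligonucleotide complementary to mrna[start:start+length].

    `mrna` may be written with U or T. Length is restricted to 20-40 bases.
    """
    if not 20 <= length <= 40:
        raise ValueError("oligonucleotide length must be 20 to 40 bases")
    region = mrna.upper().replace("U", "T")[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the mRNA")
    return reverse_complement(region)
```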

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.



Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of a cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
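
A degenerate pool can be written compactly with the standard IUPAC nucleotide ambiguity codes and then expanded into its individual members. The sketch below performs that expansion; the function name and the example probes are hypothetical and are not taken from the sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of bases allowed at each position, so a probe with one N and one Y represents 4 x 2 = 8 discrete sequences.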

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLinkNH. CovaLinkNH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLinkNH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., 1991). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLinkNH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLinkNH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLinkNH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
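
The concentrations in the linkage method follow from ordinary dilution arithmetic, which the sketch below works through. It assumes that the stated 7.5 ng/µl refers to the DNA before the 1-methylimidazole stock is added and that volumes are additive; both are assumptions, and the function names are illustrative.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add to a sample to reach final_conc.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v.
    Concentrations share one unit; the result has the sample's volume unit.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below the stock")
    return sample_volume * final_conc / (stock_conc - final_conc)


def diluted_conc(conc, volume, added_volume):
    """Concentration after `added_volume` of diluent is mixed in."""
    return conc * volume / (volume + added_volume)


# 0.1 M (100 mM) 1-methylimidazole brought to 10 mM: one part stock is
# added to nine parts DNA solution, e.g. 100 ul of stock to 900 ul of DNA.
meim_ul = stock_volume_to_add(900, 100, 10)

# DNA concentration after that addition (assumed 7.5 ng/ul beforehand).
dna_ng_per_ul = diluted_conc(7.5, 900, meim_ul)

# 25 ul of 0.2 M (200 mM) EDC added to 75 ul in the well.
edc_mM = diluted_conc(200, 25, 75)
```

Under these assumptions each well receives roughly 500 ng of DNA (75 µl at 6.75 ng/µl), and the EDC is at about 50 mM during the 5 hour incubation.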

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6 (incorporated
herein by reference). These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes 
three protocols for the isolation of high molecular weight DNA from mammalian cells (pp. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or 
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples 
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be 
prepared in 2-500 µl of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at pp. 9.24-9.28 of 
Sambrook et al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic 
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are 
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever 
device allows controlled application of low to intermediate pressures to the cell. The results of these 
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA 
fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
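
The site specificities described above can be illustrated with a short in silico digest. The following is a minimal sketch only: the function names are hypothetical, the input is treated as a linear, single-stranded 5' to 3' string (pUC19 itself is circular), and Pu and Py are taken to mean A/G and C/T respectively. It is not the procedure of Fitzgerald et al. but a way of seeing which sites each specificity would cut.

```python
import re

# Pu = A or G, Py = C or T. Lookaheads are used so overlapping sites are found.
# Normal CviJI specificity: PuGCPy.
NORMAL = re.compile(r"(?=[AG]GC[CT])")
# Relaxed CviJI** specificity as stated in the text: PuGCPy, PyGCPy and PuGCPu
# (PyGCPu is not listed and is therefore not cut in this sketch).
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_positions(seq, relaxed=False):
    """Return 0-based cut indices; the cut falls between the G and the C."""
    pattern = RELAXED if relaxed else NORMAL
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every cut position (blunt ends)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    demo = "TTAGCTTTCGCTTTAGCATT"
    print(fragments(demo))                 # normal specificity
    print(fragments(demo, relaxed=True))   # relaxed specificity
```

Because the relaxed pattern accepts three of the four NGCN classes, it cuts far more often than the normal pattern, which is in keeping with the quasi-random fragment distribution reported above.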

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5 
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
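
The geometry of the 64-patient example can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in the text: each subarray is taken to be an 8 x 8 grid of 1 mm dots, and the pins are taken to sit on the 9 mm pitch of a standard 96-well plate, so that 8 mm of dots plus the 1 mm gap equals one pin pitch. The function names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device, one pin per well position
SUB_SIDE = 8                 # assumed: 64 patients laid out as an 8 x 8 subarray
DOT_MM = 1.0                 # each dot spans 1 mm x 1 mm
GAP_MM = 1.0                 # 1 mm space between neighbouring subarrays
PITCH_MM = SUB_SIDE * DOT_MM + GAP_MM   # 9 mm, same as the 96-well pitch


def dot_origin(pin_row, pin_col, patient):
    """Return the (x, y) corner in mm of the dot printed by one pin for one patient."""
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index outside the 8 x 12 device")
    if not (0 <= patient < SUB_SIDE * SUB_SIDE):
        raise ValueError("patient index outside the 64-sample subarray")
    # Offset printing: each patient's plate is printed at a shifted position.
    off_row, off_col = divmod(patient, SUB_SIDE)
    x = pin_col * PITCH_MM + off_col * DOT_MM
    y = pin_row * PITCH_MM + off_row * DOT_MM
    return (x, y)


def footprint():
    """Return (width, height) in mm of the whole printed area."""
    return (PIN_COLS * PITCH_MM - GAP_MM, PIN_ROWS * PITCH_MM - GAP_MM)


if __name__ == "__main__":
    print(footprint())             # printed area in mm
    print(dot_origin(7, 11, 63))   # last dot of the last subarray
```

Under these assumptions the printed area is 107 mm by 71 mm, which fits within the 8 x 12 cm membrane, and the 96 x 64 = 6144 dots do not overlap.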

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 
All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 


5.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951 
and 3949-3954, were assembled using an EST sequence as a seed. Then a recursive algorithm was 
used to extend the seed EST into an extended assemblage, by pulling additional sequences from 
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when 
there were no additional sequences from the above databases that would extend the assemblage. 
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the 
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%. 
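
The seed-extension loop described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the algorithm actually used: the `Hit` record and the `search` callable are hypothetical stand-ins for a BLASTN search of the listed databases, and each newly added member is used as a query on its own, whereas the text describes hits to the extending assemblage as a whole. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one BLASTN hit."""
    seq_id: str
    score: float
    pct_identity: float


def extend_assemblage(
    seed_id: str,
    search: Callable[[str], Iterable[Hit]],
    min_score: float = 300.0,
    min_identity: float = 95.0,
) -> Set[str]:
    """Grow an assemblage from a seed EST until no further sequence qualifies."""
    members = {seed_id}
    frontier = [seed_id]
    while frontier:                      # stops when nothing new was added
        query = frontier.pop()
        for hit in search(query):
            if hit.seq_id in members:
                continue
            # Both thresholds are strict ("greater than") as in the text.
            if hit.score > min_score and hit.pct_identity > min_identity:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members


if __name__ == "__main__":
    toy_db = {
        "seed": [Hit("a", 350, 97.0), Hit("b", 310, 94.0)],
        "a": [Hit("c", 301, 96.0), Hit("seed", 400, 100.0)],
        "b": [Hit("d", 500, 99.0)],
    }
    print(sorted(extend_assemblage("seed", lambda q: toy_db.get(q, []))))
```

In the toy data, sequence "b" fails the identity threshold, so "d", which is reachable only through "b", never joins the assemblage.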

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by 
the novel polynucleotides (SEQ ID NO: 2953-3936, and 3949-3954) of the present invention, and 
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables 
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a 
polypeptide obtained by using a software program called FASTY (available from 
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated 
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98 
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a 
software program called GenScan for human/vertebrate sequences (available from Stanford 
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic 
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94 
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a 
Hyseq proprietary software program that translates the novel polynucleotide and its complementary 
strand into six possible amino acid sequences (forward and reverse frames) and chooses the 
polypeptide with the longest open reading frame. 
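
The six-frame approach of Method C can be illustrated with a short sketch. The proprietary program itself is not described in the text, so the following rests on assumptions: the standard genetic code is used, an open reading frame is taken to be the longest stop-free stretch in any frame (no initiating methionine is required), and ties are resolved in favour of the first frame examined. The function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(dna):
    """Return the longest stop-free peptide over the three forward and three reverse frames."""
    dna = dna.upper()
    best = ""
    for strand in (dna, revcomp(dna)):
        for frame in range(3):
            for piece in translate(strand[frame:]).split("*"):
                if len(piece) > len(best):
                    best = piece
    return best


if __name__ == "__main__":
    print(longest_orf("ATGAAACCCGGGTTTTAG"))
```

For this example sequence the longest stop-free stretch lies on the complementary strand, which is why both strands must be examined.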

5.3 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences 
and their corresponding protein sequences were generated from the assemblage. Any frame shifts 
and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank. Other computer programs which may 
have been used in the editing process were phredPhrap and Consed (University of Washington) and 
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the 
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335. 

Table 1 shows the various tissue sources of SEQ ID NO: 1-351. 

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version 
2.0a19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release 
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for 
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by 
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication 
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites" 
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum 
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the 
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides 
and the maximum score and mean score associated with that signal peptide. 

5.4 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The 
corresponding amino acids are SEQ ID NO: 1336-1750. 

Table 1 shows the various tissue sources of SEQ ID NO: 352-766. 

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs with 
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 


5.5 EXAMPLE 5 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118, 
UniGene version 118, Genpept release 118). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The 
corresponding amino acid sequences are SEQ ID NO: 1751-1914. 

Table 1 shows the various tissue sources of SEQ ID NO: 767-930. 

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version 
2.0a19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release 
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the homologs for 
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences for which the nucleic 
acid sequence encodes are shown in the Sequence Listing. The homologues with identifiable 
functions for SEQ ID NO: 767-930 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118, 
UniGene version 118, Genpept release 118). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The 
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949. 

Table 1 shows the various tissue sources of SEQ ID NO: 931-965. 

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.7 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119, 
UniGene version 119, Genpept release 119). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The 
corresponding amino acid sequences are SEQ ID NO: 1950-1958. 

Table 1 shows the various tissue sources of SEQ ID NO: 966-974. 

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 



Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.8 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120, 
UniGene version 120, Genpept release 120). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 975-984. The 
corresponding amino acid sequences are SEQ ID NO: 1959-1968. 

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000 
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.9 EXAMPLE 9 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120, 
UniGene version 120, Genpept release 120). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The 
corresponding peptide sequences are SEQ ID NO: 3943-3948. 

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942. 

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest 
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences for 
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs 
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 10 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 11 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequences within the sequences that code for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 
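
To make the two summary values concrete, the following Python sketch computes a maximum S score and a mean S score from a list of per-residue signal peptide scores. It does not call SignalP. The per-residue scores and the signal peptide length are hypothetical inputs, and the sketch assumes, as a simplification, that the mean S score is the arithmetic mean of the S scores over the residues of the predicted signal peptide.

```python
def signal_peptide_scores(s_scores, signal_length):
    """Summarize per-residue S scores for a predicted signal peptide.

    s_scores      -- per-residue S scores, N-terminus first (hypothetical input)
    signal_length -- number of residues before the predicted cleavage site
    Returns (maximum S score, mean S score over the signal peptide).
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the scored region")
    region = s_scores[:signal_length]
    max_s = max(s_scores)
    mean_s = sum(region) / len(region)
    return max_s, mean_s


# Hypothetical scores: high over the first three residues, low afterwards.
max_s, mean_s = signal_peptide_scores([0.9, 0.8, 0.7, 0.2, 0.1], 3)
print(max_s, round(mean_s, 3))
```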

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS. 



TABLE 1 



Tissue Origin 


RNA 
Source 


Library 
Name 


SEQ ID NOS: 


lung 






3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain 


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739 
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110 
112 114 118 120-124 126 132-133 135 
139 143 146 148-149 159 163 168 174 
176 179-180 184-185 188-190 202 208- 
209 216-217 221 223 230 234-235 240 
244 249 251 253 255 258-259 263 269- 
270 277 282 285-286 290 294-295 297 
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431 
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 


adult brain 


Clontech 


ABR001 


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688 
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 


adult brain 


Clontech 


ABR006 


59, 32 45 53 SO 72 91 103 118 IZj 130- 
131 134 184 224 275 338 350 354 361 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811 
818 887 903 906 918 930 942 947 957 
973 977 


adult brain 


Clontech 


ABR008 


2-3 9-11 14 17 21 23-25 28-29 31-35 37 
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115 
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402- 
403 405 409-412 414 418-421 423-424 
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724 
728-730 732 734-735 738-740 745 747- 
750 753-755 757 761 763-764 766-769 
772-773 775 780-781 789-791 793-795 
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843 
845 856 858-859 861 864 866 870 872 
876 880 883 885 887 893-898 902 906- 
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 
962-965 967 969-970 972 977 


adult brain 


Clontech 


ABR011 


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880 


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain 


Invitrogen 


ABR016 


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
1.58 167-i& 135-i38. If4 VX-V 12 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 ->*1 346 348 
371 374 388 391 394 399 401 4QP 411' 
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured 
preadipocytes 


Stratagene 


ADP001 


4 28-29 69 93 114 121 132-133 135 151- 
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 
312 314 326-327 361-362 371 374 388 
394 401 403 405 411 420 437 453 466- 
467 470 474 478 496 507-509 517 530 
532-533 584 588 593 602-603 608 610 
617-621 630-631 633 639 642-643 661 
693 729 746 761 765 769 834 842 848 
887 907 923 947-950 957 967 969 


adrenal gland 


Clontech 


ADR002 


1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420 
431 437 444 446 448 457 462 484 500 
507 517 524 532-533 539 545 554 561- 
562 564 588 597 602-603 606-607 635 
642 646 649 658 664 674 693 703 730 
740 745 752 759 765 767 775 779 799 
809 817-818 839 845 856 859 863 887 
890-891 896 948 953 958 961-963 973 


adult heart 


GIBCO 


AHR001 


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124 
128-133 141 144-146 149 152 159 162- 
163 168 172 176 179 181 184 186-187 
190-191 201 203 208-209 212 216-218 
221 223 227 229 233 244 247 249 253- 
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371 
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431 
436 452-453 464-465 470-474 481-484 
487-488 490 492-494 496 499-500 505- 
506 508-509 51 A 523 529-530 533 547- 
548 553 558 563-565 577-578 585-58?. 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642- 
644 652 658 661 672 682-683 688 691 
693 697 699 708 711 713 715 732 737 
745 747-748 750-753 759 761 765 768- 
770 775 790 802-803 814-815 818-819 
830 837 839-840 842 845 848 859 861- 
862 867 876-877 887 891-892 896 900- 
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 


adult kidney 


GIBCO 


AKD001 


1 3 8 12-14 17 19-25 28-29 33-34 37-39 
41 46-48 50 52 55-60 62 65-67 69 71-72 
75 77-78 82 84 89-90 93 97 108-110 114- 
116 118-121 123-125 128 130-133 135 
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216- 
217 219 221 223-224 229 232-235 244 
247 250 253 255-256 258 263-264 268- 
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452454 462 464- 
465 470 472-474 477 479 481 483-485 
487489 492495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 


1 3 16 21 30 32 35 3841 464 7 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363 
372 380 383 396 399 402 418-419 426- 
427 431 448 454 m 471-'74 4884$\ 
495 m 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 :W'2 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung 


GIBCO 


ALG001 


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426- 
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963 
967 


lymph node 


Clontech 


ALN001 


3 10 110 146 160 168 196 209 221 269 
278 301 336 348 394 405 411 420 422 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO 


ALV001 


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139 
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755 
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911 
949 958 965 969 972-973 


adult liver 


Invitrogen 


ALV002 


3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 
587 594-595 604-605 608 610 621 630- 
631 634-635 637 657 664 690 693 699 
723 726 745 751 763 767 784 793 811 
822 845 848 852 85.* 861-862 864 892 
899 908-909 925 950 958 967 983 


adult liver 


Clontech 


ALV003 


60 134 169-171 275 


adult ovary 


Invitrogen 


AOV001 


1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232- 
235 240-242 246-247 249 251 254-255 
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452 
454 462 464 466-467 469-471 474 479 
482-484 488 490 492-496 498 500-504 
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta 


Invitrogen 


APL001 


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 



GIBCO 


ASP001 


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
172 189 20? ?0O 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO 


ATS001 


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 
214 221 223 230 254-255 258 263 269 
283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560 
563 574 582 589-590 593 608 616-618 
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784 
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 


Genomic DNA 
from BAC 
63118 


Research 
Genetics 
(CITB BAC 
Library) 


BAC001 


515 


Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC002 


640 


Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC003 


640 


adult bladder 


Invitrogen 


BLD001 


50 55 66 71 111 143-144 148 160 201 209 
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 


bone marrow 


Clontech 


BMD001 


3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118- 
120 123-124 128 130-133 143-144 148 
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221 
223-224 227 233-236 244 247 249 252 
254 25S 260-262 267 2C'J 272 278 280- f 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410- 
412 418-421 425-428 431 433 435 442 
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711 
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 


bone marrow 


Clontech 


BMD002 


3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156 
159-160 168 176 182 224 254-255 271- 
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488- 
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811 
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001 


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL016 


358 740 760 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL021 


468 527 928 


adult cervix 


BioChain 


CVX001 


1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160- 
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230 
232 234-235 240-242 246 254 260-263 
268-2; i) V.n 277 .m 292 295 297 
305-308 314-316 319 328 343-344 348 
354 35S 363 368 380 382-384 389 394 
396 399 4t:i 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464 
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2) 
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA 
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal 
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech), 
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph 
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) 
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain). 

779-780 784 788 810-811 813-815 822 
834 836-837 839 848 861 866-867 871 
874 877 887 891-894 897-898 901 913 
916 919 921-922 925 946-947 953 958- 
959 967 969 973 


diaphragm 


BioChain 


DIA002 


3 39 184 203 431 563 848 967 


endothelial 
cells 



Stratagene 


EDT001 


3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115 
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320- 
321 323 325-326 328-329 331-332 334- 
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498 
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625 
628 630-631 633-637 642-643 646 648 
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722 
724 727 729 740 745 748-750 752 761 
765 767-770 772-77:. "79 734 ?89 7S2 1 
794 796 802-803 811 817-818 821 824 
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984 


Genomic 
clones from the 
short arm of 
chromosome 8 


Genomic 
DNA from 
Genetic 
Research 


EPM001 


324 515 640 


esophagus 


BioChain 


ESO002 


97 103 128 371 474 


fetal brain 


Clontech 


FBR001 


67 129 156 159 232 267 433 446 503 845 
952 


fetal brain 


Clontech 


FBR004 


28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870 
887 906 958 


fetal brain 


Clontech 


FBR006 


10-11 14 21 30 32 47 49 56 65 69 72 77- 
78 82 84 97 101 115 118 121 125 128 
130-131 138 142 148 152 159-160 179 
185 188 194 197 203 210 212 214 219 
222 227-229 243-246 249 252 256 264 
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622 
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-823 
835 843 845 856 859 864 867 876 880 

oor oorr onA on^ Oft/fl onjT rtio m O Cvic 

Kbd 66/ syO 8yi-Jjy4 89o 91 j 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 
634 642-643 647-648 650 679 689 693 
699 712 715 742-743 745 748-749 753 
768-769 793 797 829-831 8?4 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001 


19 57 130-131 354 431 642 769 844 


fetal kidney 


Clontech 


FKD001 


3 31 33-34 38 48 54 72 160 208-209 211 
223 264 269 277 283 290 313 325 341 
348 358 396 418-420 474 484 506 508- 

509 517 520-521 532 547 553 558 567 
369 Do/ oyo oUo 610 ol j 019 522 626- 

AT>1 /Of O tZHQ 11 A 1AZ ft 1 Q QA1 QQ*7 ftCWC 
Oil 04/ Ofy /*0 51 o 00/ oyO 

903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney 


Invitrogen 


FKD007 


3 118 186-187 230 244 271 432 887 969 


fetal lung 


Clontech 


FLG001 


69 132-133 156 168 208-209 217 267 269 
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 


fetal lung 


Invitrogen 


rlAJwJ 


1 ft Oft OO m <A ftO Qft GO 1 tXQ. 1 ft^_ 
3 0 lo-ly 51 5y jXj TO ol 00 loo loo* 

187 200 204 212 226 229 246 274 309 
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604- 
605 621 624 636 642-643 661 677-678 
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 


fetal lung 


Clontech 


FLG004 


130-131 394 664 769 942 


fetal liver- 
spleen 


Columbia 
University 


FLS001 



3 8-10 12-13 16-17 19-25 27-29 33-35 37- 
38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217 
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396 
398-399 401-411 413-414 416 418-421 
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521 
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 81 1 i. Vs 817-819 822- 
825 830-831 834 837 840 842 845-848 
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930- 
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973 
976-977 981-983 


fetal liver- 
spleen 


Columbia 
University 


FLS002 


3 8-13 15-17 19-20 22 25 28-29 33-35 37 
41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 216-218 223-224 226-230 232- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342- 
344 352 354-355 358 361-365 367-368 
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418- 
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479 
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715 
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864- 
865 867 874-878 888 891-892 896-900 
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 


fetal liver 


Invitrogen 


FLV001 


37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240 
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453 
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642- 
644 646 664-665 669 679 715 717 720 
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001 


15 27 32 37 67 72 83 99 112 121 138 167 
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967 
Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962- 
963 973 


fetal skin 


Invitrogen 


FSK001 


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115 
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin 


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405 
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001 


276 563 842 


umbilical cord 


BioChain 


FUC001 


3 20 33-34 39 48 50 52 5f-j7 65 67 69 72 
77 79 82 92 109 112-113 121 lli:M33 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 774-775 793 797 807 818 822 837 
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969 
fetal brain 


GIBCO 


HFB001 


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726 
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 
896-897 900 906-907 910-911 918 921- 
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001 


86 168 186-187 297 537 608 681 761 845 

8?'.:- ■ . 


infant brain 


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415-
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612- 
613 616-618 620 622 624 629-632 634-
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785- 
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 


infant brain 


Columbia 
University 


IB2003 


3 12-13 21 27-29 32 39 49 69 72 82 91 
113 116 126 128 132-133 142 144 156 
176-177 184-185 188 194 208 212 223-
224 228 230 244 255 259 267 270 273 
276 293-294 312 320 326-327 337 342 
346 354-355 358 361-363 382 388 390 
394 396 399 402 420 425 431 442 462 
474 482 484 488 495-496 510 520-522 
524 529 540-541 549 563 582 586 588- 
589 596 600-603 606-607 612 617-618 
620-621 632 647 650 679 720-722 724 
735-736 746 751 754 769 785-786 793 
800 807 811-813 818-819 822 824 831
834 838-840 843 856 864 892 896 907 
919-920 925 930-931 936 947 950 957 
973 982 


infant brain 


Columbia 
University 


IBM002 


16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 


infant brain 


Columbia 

University


IBS001 


84 86 180 185 198 201 203 23^ 279 312 
326 346 354 366 38? 438 542 : 58S 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931 


lung, fibroblast 


Strategene


LFB001 


3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


lung tumor 


Invitrogen 


LGT002 


1 3 9-10 12-13 20 31 38 41 46 48 51-52 
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203 
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292 


294 297 301 308-309 311 314 317 321
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 
848 855 857 859 862 864 866 870 875- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 


lymphocytes 


ATCC 


LPC001 




3 9-11 32 47 50 56 71 75 88 97 99 102 
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398-
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658 
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915:728 947 973 98 


leukocyte


GIBCO 


LUC001 




1 3 911 18-19 21 23-2T !>7 31-:i4 39 41- 
42 46-48 52 54-58 62-69 ? i-72 74-75 78- 
80 82 89-90 93 99 110 115-12! 123-124 
128-133 135 138 141 143-146 1*9 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240 
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484 
492-494 496-498 500 506-510 516-517
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602- 
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte 


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532 
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC #CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767- 
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 


mammary 
gland 


Invitrogen 


MMG001 


1 14 19 21 28-29 31-37 47 49-51 55 57 
53-67 69 7i 71 75-78 92 1 08-109 11 J 1 16 
121 123-124 126 128 130-133 135 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412 
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 
580 582 584 587-589 593 597 601-610 
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650 657 663- 
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738- 
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 


induced neuron 
cells 


Strategene 


NTD001 


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592 
610 628 642 650 745 748 752 761 793 
818 848 851 897 


retinoic acid
induced neuron 
cells 


Strategene 


NTR001 


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Strategene 


NTU001 


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643 
658 732 740 765 769 784 791 793 799 
802-803 818 842 851 864 897 907 932 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 


Clontech 


PRT001 


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 
505-506 523 537 54: 564 583 C02-CZZ 
61 1 619 (.'23 643 650 697 711 729 7c i 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001 


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521 
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland 


Clontech 


SAL001 


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981 
salivary gland


Clontech 


SALs03 


217 254 270 388 610


skin fibroblast 


ATCC 


SFB001 


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SIN001 


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761
769 774-775 794 824 864 904 906 910- 
911 913 948 953 959 976 984


skeletal muscle 


Clontech 


SKM001 


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 


SPC001 


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172 
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374 
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471 
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611 
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958


adult spleen


Clontech 


SPLc01


3 6 12-13 rn :;aH31 17* 305 tv; 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001 


35 114 130-1 j! 144 155 176 189 206-207 
249 260-262 336 3S2 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224-
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001 


10 16 20 28-29 32 37 41 52 57 66-67 74- 
75 110 118 121 129-131 141 151 159-160 
208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500 
508-509 517 524 532 537 551 558 560 
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 


Clontech 


THMc02 


1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437 
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619-
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 


thyroid gland 


Clontech 


THR001 


3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-1; V 258 27? 282 2 r o ■ 
297 303-306 308 311 317-318 322-323
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 
877 887 893-894 896-897 907-909 912 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963 
967 969 973 981 


trachea 


Clontech 


TRC001 


33-34 55-56 69 74 163 172 190 209 212 
267 270 297 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 


uterus 


Clontech 


UTR001 


4 9 18 37 63-64 74 108 114-115 130-131
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 
400 403 406 411 425 431 434 437 440 
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1 


L06175 


Homo sapiens 


occurs in MHC class I region; ORF 


308 


98 


2 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta. 


3094 


98 


3 


X15187 


Homo sapiens 


precursor polypeptide (AA -21 to 
782) 


4112 


100 


4


A C1 1 n£A/\ 

At 1 10640 


Homo sapiens 


orphan seven-transmembrane 
receptor 


344 


100 


5




Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7879. 


158 


72 


6 


W85607 


Homo sapiens 


Secreted protein clone da228_6. 


1477 


100 


7 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 
hDRR4. 


884 


88 


8




Homo sapiens 


Leul 


391 


100 


9 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


10 


XQ2106 


Homo sapiens 


bleomycin hydrolase 




1 PA 

Vt > . 




Y15228 


Homo sapiens


445 




12 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol-
anchored protein homolog


432 


34 


13 


U27838 


Mus musculus


glycosyl-phosphatidyl-inositol-
anchored protein homolog


320 


27 


14 


Y71062 


Homo sapiens 


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15 


U96781 


Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


16 


M16653 


Homo sapiens 


pancreatic elastase IIB zymogen 


1435 


99 


17 


Y13398 


Homo sapiens 


Amino acid sequence of protein 
PRO346.


1749 


99 


18 


Y02283 


Homo sapiens 


Secreted protein clone hr342_l 1 
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_l 
protein sequence SEQ ID NO:66. 


1371 


100 


20 


AL031320 


Homo sapiens 


dJ20N2.5 (novel protein similar to
fucosidase, alpha-L-1, tissue (EC
3.2.1.51, alpha-L-fucosidase
fucohydrolase))


2597 


99 


21 


B01384 


Homo sapiens 


Neuron-associated protein. 


1876 


100 


22 


Y68778 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-10. 


2470 


100 






— — : 

Homo sapiens 


Human KHoz protein. 


4781 


99 




I JJ7JJ 


Homo sapiens 


Human Kno2 protein. 


2807 


100 






Caenorhabditis
elegans


contains similarity to TR:O95029

- 


463 


31 




V/V70TO 


(Of 


Human secreted protein fragment 


1540 


100 


27


YOOA1A 


Homo sapiens 


serine/threonine protein kinase 


3781 


98 


28


/ir 1JU / jj 


Mus musculus 


microtubule-actin crosslinking factor 


3514 


68 


29


at idu/dd 


Mus musculus 


microtubule-actin crosslinking factor


3725 


70 


30


•71CA1 1 


Mus musculus 


DMR-N9 


2988 


86 


31




Homo sapiens 


axonemal dynein heavy chain 


6058 


99 


32 


AP037256 


Mus musculus 


ES2 protein 


2260 


91 


33


OOZ140 


Homo sapiens 


TLS=nuclear RNA-binding protein


2917 


100 


34


ooz 140 


Homo sapiens 


TLS=nuclear RNA-binding protein


2890 


98 


36 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2 


1767 


100 


37 


D79994 


Homo sapiens 


similar to ankyrin of Chromatium 
vinosum. 


6089 


99 


38


Xo3380 


Homo sapiens 


serum response factor-related protein 


1966 


99 


39


AL022072 


Schizosacchar 
omyces pombe 


lipoic acid synthetase 


1067 


61 


40




Homo sapiens 


alkaline phosphatase 


2751 


100 


41 


AF132968 


Homo sapiens 


CGI-34 protein 


1088 


98 


42 


AL117637


Homo sapiens 


hypothetical protein 


2208 


100 


43 


AL021393 


Homo sapiens 


bK747E2J (novel protein) 


1526 


100 


44 


X68011 


Homo sapiens 


ZNF81 


1886 


100 


45 


AC002464 


Homo sapiens 


organic cation transporter, 50% 
similarity to JC4884 (PID:g2143892) 


2423 


100 


46 


W78245 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 19. 


1949 


100 


47


Y41765 


Homo sapiens 


Human PRO1083 protein sequence.


3604 


100 


48


Ar 097330 


Homo sapiens 


H1 chloride channel; p64H1; CLIC4


1305 


99 


50 


U09413 


Homo sapiens 


zinc finger protein ZNF135 


1361 


57 


51 


AF061812 


Homo sapiens 


keratin 16 


2374 


100 


52 


W63681 


Homo sapiens 


Human secreted protein 1. 


1326 


99 


53 


AB035303 


Homo sapiens 


cadherin-10


4094 


100 




A 12022 


synthetic 
construct 


MRP-* 




100 


55 


AL121897


Homo sapiens 


bA392M18.3 (KIAA0180) 


1867 


100 


56 


Y73330 


Homo sapiens 


HTRM clone 397663 protein 
sequence. 


818 


96 


57


A CI Kt fit Q 


Homo sapiens 


HSPC184 


955 


100 


58




Homo sapiens 


bisphosphate 3'-nucleotidase


1586 


100 


59


A 171 i o<in 
AT 1 A 50 ID 


Homo sapiens 


orphan G protein-coupled receptor 


1971 


100 


60


AU44S/4 


Homo sapiens 


precursor polypeptide 


1903 


100 


61




Homo sapiens 


EDKF 


528 


100 


62 


D15057 


Homo sapiens 


DAD-1 


567 


100 


63 


AF260665 


Homo sapiens 


histone acetyltransferase 


1510 


100 


64 


AF260665


Homo sapiens 


histone acetyltransferase 


1429 


96 


65 


AJ277145 


Homo sapiens 


ras-related small GTPase RAB18 


1073 


100 


66 


Y94950 


Homo sapiens 


Human secreted protein clone 
dhl073 12 protein sequence SEQ ID 
NO: 106. 


348 


100 


67 


Y82744 


Homo sapiens 


DNA replication and repair 
associated protein (DRASP). 


1028 


100 


68


i*f44oO 


Homo sapiens 


Human urKW receptor polypeptide. 


1721 


100 


69 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein BING4
(similar to S. cerevisiae YER082C, 
M. sexta MNG10 and C. elegans 
F28D1.1) 


3196 


100 




70 


AJ276316 


Homo sapiens 


zinc finger protein 304 


1751 


52 


71 


Y18314 


Homo sapiens 


paraplegin-like protein 


4146 


99 


72 


AF157028 


Homo sapiens 


protein phosphatase methylesterase-1 


2017 


100 


74 


Y71082 


Homo sapiens 


Human B-aggressive lymphoma 


1765 


99 








(BAL) protein. 






75 


AF225420 


Homo sapiens 


AD025 


734 


100 


76 


X95235 


Homo sapiens 


transcription factor AP2 


217 


100 


77 


AF108420 


Takifugu


1-aminocyclopropane-carboxylate


733 


56 






rubripes 


synthase 






78 


G01349 


Homo sapiens 


Human secreted protein, SEQ ID 


650 


99 








NO: 5430. 






79 


AL117635


Homo sapiens 


hypothetical protein 


922 


99 


81 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast


865 


77 








suppressor protein SRP40) 






82 


AF183414 


Homo sapiens 


hemin-sensitive initiation factor 2a 


3231 


99 








kinase 






83 


G01143 


Homo sapiens 


Human secreted protein, SEQ ID 


495 


98 








NO: 5224. 






84 


U03985 


Homo sapiens 


N-ethylmaleimide-sensitive factor


3744 


99 


85 


Y17791 


Homo sapiens 


VAX2 protein 


1496 


100 


87 


AF263538 


Homo sapiens 


growth differentiation factor 3 


1944 


99 


88 


Y19757 


Homo sapiens 


SEQ ID NO 475 from WO9922243.


1361 


100 


89 


AF161493 


Homo sapiens 


HSPC144


1185 


100 


90 


AF161493 


Homo sapiens 


HSPC144 


856 


100 


91 


B25780 


787 


Human secreted protein SEQ ID 


647 


41 


92 


U57344 


Mus musculus 


Meis3 


1007 


89 


93 


AF172854 


Homo sapiens 


cardiotrophin-like cytokine CLC 


1197 


98 


94 


AL390114 


Leishmania 


extremely cysteine/valine rich 


223 


29 






major 


protein 






95 


AB016886 


Arabidopsis 


contains similarity to adenylate 


287 


38 






thaliana 


kinase~gene_id:MCA23.18






96 


AC005525 


Homo sapiens 


F22162 I 


1855 


96 


97 


B20997 


Homo sapiens 


Human nucleic acid-binding protein, 


3836 


99 








NuABP-1. 






98 


AJ006692 


Homo sapiens


lugh surfer Ke.\;iL; 


507 




99 


AF172264 


Homo sapiens 


Traf2 and NCK interacting kinase,


69«I: 


99 








splice variant 1 






100 


L11239


Homo sapiens 


homeobox protein


717 


100 


101 


AC004890 


Homo sapiens 


similar to zinc finger proteins; 


2154 


98 








similar to AAC01956












(PID:g2843171) 






102 


AC003682 


Homo sapiens 


R28830 2 


1287 


48 


103 


AF201839 


Rattus 


dynamin mbb isoform 


4270 


95 






norvegicus 








104 


Y79510 


Homo sapiens 


Human carbohydrate-associated 


1394 


100 








protein CRBAP-6. 






105 


Y79510 


Homo sapiens 


Human carbohydrate-associated 


1209 


90 








protein CRBAP-6. 






106 


AL096748 


Homo sapiens 


hypothetical protein 


1216 


100 


108 


X97260 


Homo sapiens 


Metallothionein 2 


381 


100 


109 


AL034422 


Homo sapiens 


dJ1141E15.2 (novel protein)


433 


100 


110


AF191338 


Homo sapiens 


anaphase-promoting complex subunit 
4 


683 


100 


111 


AL021712 


Arabidopsis 


putative protein 


185 


26 














112 


AF250138 


Homo sapiens 


small stress protein-like protein 


1063 


100 








HSP22 






113 


AL109976 


Homo sapiens 


dJ794I6.1.1 (novel protein) 


4176 


99 


114 


Y36151 


787 


Human secreted protein 


668 


100 



115


AF110399


Homo sapiens 


elongation factor Ts 


1666 


100 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family
member GLUT9 


2052 


99 


117 


Y73328 


Homo sapiens 


HTRM clone 082843 protein 
sequence. 


931 


100 


118 


X04085 


Homo sapiens 


catalase 


2846 


100 


119 


AF147717 


Homo sapiens 


ubiquitin C-terminal hydrolase 
UCH37 


1695 


100 


120 


X73882 


Homo sapiens 


microtubule associated protein 


3801 


99 


121 


AC004882 


Homo sapiens 


similar to CAA16821 
(PID:g3255952)


3223 


100 


122 


M93311 


Homo sapiens 


metallothionein-III


421 


100 


123 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


557 


94 


124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


222 


53 


125 


AF232009 


Homo sapiens 


peroxisomal trans 2-enoyl CoA 
reductase 


1565 


99 


126 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


127 


M60165 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2 


1832


99 


128 


Y10319 


Homo sapiens 


carnitine carrier 


1592 


100 


129 


U75467 


Drosophila 
melanogaster 


Atu 


937 


36


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta


494 


87


131 


Z21507 


Homo sapiens 


human elongation factor-1-delta


938 


100 


132 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


6745 


100 


133 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


4818 


95 


134 


M13692 


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99 


135 


U72970 


Sus scrofa 


calcium/calmodulin-dependent 
protein kinase II isoform gamma-B 


2723 


99 


136 




Homo sapiens


Human secreted protein, SEQ ID
NO: 7294. 


450 , ; 00 1 

* 


137 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A 
member 24 


627 


99 


138 


AF155v;4S 


Homo sapiens 


putative zinc finger protein 


5855 




139 


AF144638 


Homo sapiens 


sphingosine-1-phosphate lyase


2977 


100 


140 


AF152318 


Homo sapiens 


protocadherin gamma A1


4778 


100 


141 


B08517 


Homo sapiens 


Amino acid sequence of a beta- 
tubulin antigen. 


5841 


100 


142 


X56667 


Homo sapiens 


calretinin 


1410 


99 


143 


X92763 


Homo sapiens 


tafazzins


1605 


100 


144 


Y95293 


Homo sapiens 


Human GEF containing NEK-like 
kinase substrate sGNK. 


4092 


99 


145 


AF226046


Homo sapiens 


GK003 


1198 


100 


146 


M22877 


Homo sapiens 


cytochrome c 


554 


98 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


100 


148 


AB026491 


Homo sapiens 


PICK1 


2114 


98 


149 


AB018580


Homo sapiens 


hluPGFS 


1699 


100 


150 


X91868 


Homo sapiens 


six1


1509 


100 


151 


AF266505 


Mus musculus


pseudouridine synthase 3 


2135 


84 


152


U29170


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 






WO 01/57190 



PCT/US01/04098











alpha-1,4-N-acetylglucosaminyltransferase.


1 O/O 


100 




AF1 1ft£4S 


Homo sapiens


can uj time tumor suppressor 

TWOI hfiTtinirto 

Li 1 VJ 1 IlUlUUIUg 


12^4 


99 


157 


AF159297 


Zea mays


extensin-like protein 


238 


25 


158




nUiUU bopiciia 


suytrrr f j ^noincoDox protein 


14j / 


100 


159 


AF073298 


Homo sapiens 


small EDRK-rich factor 2 


294 


100 


160


A^UUfOJO 


Homo sapiens 


ui smaii noonucieoprotem loNKr 
nomoiog, niaicn 10 jrJLi^.gHujuuo / 


4032 


100 




ARA1910Q 


Homo sapiens






100 


162 


AL162751 


Arabidopsis 
thaliana 


putative protein 


194 


32 


163


AJUUD07O 


Homo sapiens 


poly(A)-specific ribonuclease


3351 


100 


164


Ar 1 1 /two 


Homo sapiens 


long protem 


2547 


99 


165




Homo sapiens 


similar to ciliary dynein beta heavy 

Attain* TfQOX. Ctmi'layihi tn OOO Af\0 

cnain, / 0/0 oimuariiy to rj.i\)yo 


5065 


100 


166




Homo sapiens


numon mcuuiomionein-ie 


iol 


1 AA 

100 


167




Homo sapiens


PAPT1A 

WTJ\JL/*r 


4yol 


100 


168


AFIrilSIK 


Homo sapiens


nor\^107 


1604 


100 






Homo sapiens


fibrinogen beta chain


2482 


100 


170 


M64983 


Homo sapiens 


fibrinogen beta chain 


2679 


100 


171


IVlJOJ 1H 


Gallus gallus


fibrinogen beta chain 


1059 


78 


172 


AF078845 


Homo sapiens 


16.7Kd protein 


786 


100 


173


A f^l\(\AHHA 

AUUU4 / /4 


Homo sapiens 


Dlx-o 


923 


100 


174


Z98974 


Schizosacchar 
omyces pombe 


putative vacuolar protein sorting-
associated protein


185 


31 


175 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


283 


23 


176


W /4/ZO 


Homo sapiens 


Human secreted protein fg949_3. 


1879 


100 


177




Homo sapiens 


cystinosin 


1920 


100 


178 


AC024796 


Caenorhabditis 
elegans 


contains similarity to TR:076167 


221 


27 


179 


Y66632 


Homo sapiens


Membrane-bound protein PRO276.


5370 


100


180




Homo sapiens 


fXJl-45 protein 


215 


28 


181 


G02694 


Homo sapiens 


Human secreted protein, SEQ ID
NO: t>77*. 


283 


100 


182




— s 

Homo sapiens 


Human cell death preventing kinase 
(DPK-1) protein sequence.


2676 


100 


183 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86


148 


27 


184


AT71 <1 Q« 

Arl jIoDj 


Homo sapiens 


CGI-97 protein 


1214 


96 


185 


AF289664 


Mus musculus 


CYLN2 


4673 


90 


186 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)


4059 


100 


187


at tYyyyxsi 


— : ; 

Homo sapiens 


oj 1 042K1 vu (supported by 
GENSCAN, FGENES and 

UHiNiiWloxlJ 


2332 


100 


188 


X83543 


Homo sapiens 


APXL 


8513 


99 


189




Homo sapiens


actin binding protein MAYVEN


3106 


99 


190 


M18135 


Rattus 
norvegicus 


smooth-muscle alpha tropomyosin 


1306 


95 


191 


AF242194 


Drosophila
melanogaster 




147 
if/ 




192 


D30689 


Bacillus 
subtilis 


subunit of nitrite reductase 


113 


29 


193 


Y44984 


Homo sapiens 


Human epidermal protein-1.


538 


97 





194


dz jo /y 


Homo sapiens


— : 

Human secreted protein sequence 
encoded by gene 15 SEQ ID NO:68. 


/OU 


100 


195




/o/ 


homologue of mouse dkk-1 generAcc 


1400 


100 


196




Mus musculus


jerky 


2U21 


75 


197


at I'j/oi^n 


Homo sapiens 


qjm uui i . i (novel protein ) 


632 


100


198 


X56203 


Plasmodium 
falciparum


liver stage antigen 


512 


24 


199 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta.


2027 


63 


200


X87237 


Homo sapiens 


a-glucosidase I 


4447 


99 




Arl0107o 


Caenorhabditis
elegans


/"IT TT 1 


1393 


46 


202 


X04571 


Homo sapiens 


precursor polypeptide (AA -22 to 
1185) 


6611 


100 


203


X00474 


Homo sapiens 


pS2 precursor 


466 


100 


204 


AB029333


Halocynthia
roretzi


HrPET-1


974 


54 


205 


AF146019


Homo sapiens 


hepatocellular carcinoma antigen 
gene dzu 


998 


100 


206 


AF071002 


Homo sapiens 


minK-related peptide 1; MiRP1


632 


100 


207


AB038162


Homo sapiens 


trefoil factor 2 


744 


100 


208 


U30521 


Homo sapiens 


P311 HUM


363 


100 


209 


AB000911


Sus scrofa 


ribosomal protein 


782 


100 


210 


AB021227 


Homo sapiens 


membrane-type-5 matrix 
metalloproteinase 


3545 


100 


211 


AF180920


Homo sapiens 


cyclin L ania-6a


2722 


100 


212 


AF105365


Homo sapiens 


K-Cl cotransporter KCC4 


5624 


100 


213


U29244


Caenorhabditis 
elegans 


similar to human (TRE) transforming 
protein (PIR:S22157)


602 


32 


214 


AL033538 


Homo sapiens 


dJ477H23.1 (novel protein)


3195 


100 


215 


X52011 


Homo sapiens 


muscle determination factor 


1262 


100 


216 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog 


739 


100 


217 


AF006751 


Homo sapiens 


ES/130 


4793 


99 


218 


AB007859 


Homo sapiens 


KIAA0399 protein


3559 


99 


219 


AK026291


Homo sapiens 


unnamed protein product


826 


100 


! "V" , 


Y«i404_s 


Homo sapiens 


Splice variant of -caiic-i; asst dalptl 
poiyp^Kide CHl-9ai 1-2. * 




3*7 


t oil 

222


£67996 


Homo sapiens 


tenascin-R (restrictin)


7186 


100 


223


AF134802


Homo sapiens 


cofilin isoform 1 


S^A 


100 


224 


Y17711 


Homo sapiens 


atopy related autoantigen CALC 


1611 


99 


225 


AF190051 


Gallus gallus


hepatocyte nuclear factor la 
dimerization cofactor isoform 


443 


81 


226




Homo sapiens 


unnamed protein product


866 


98 


227




Schizosacchar 
omyces pombe 


nuf2-like coiled-coil protein


230 


25 


228 


AF275948 


Homo sapiens 


ABCA1 


11763 


99 






Homo sapiens 


HorvJ266 ■ 


2006 


98 


230


I lOZ/0 


Homo sapiens 


paralemin 


1951 


100 


231 


AJ245599 


Homo sapiens 


putative secreted ligand 


2379 


99 


232 


W88499 


Homo sapiens 


Human stomach carcinoma clone 
HP10412-encoded protein. 


1545 


99 


233 


AF096286


Mus musculus


pecanex 1 


3623 


93 


234 


V64ol9_ca 
1 


Homo sapiens 


30-NOV-1990 Human HE1 cDNA. 


796 


100 




1 


Homo sapiens


jLriNL/v-iyyif Human rici cuina. 


4/U 


95 


236 


AF227258 


Bos taurus


RPGR-interacting protein-1


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562 


Homo sapiens 


dJ684Q24.2 (prodynorphin (Beta- 


1330 


100 











Neoendorphin-Dynorphin precursor,
Proenkephalin B precursor)) 






239 


AF262027 


Homo sapiens 


eIF-5A2


808 


100 


240 


AL079344 


Arabidopsis 
thaliana 


putative protein 


194 


33 


241 


AC002394 


Homo sapiens 


Gene product with similarity to 
dynein beta subunit 


1542 


51 


242 


AJ271361 


Takifugu 
rubripes 


FRANK2 protein 


303 


30 


243 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger
protein 184) 


1476 


48 


244 


AF190167 


Homo sapiens 


membrane associated protein SLP-2 


1736 


99 


245 


Y10601 


Homo sapiens 


ankyrin-like protein 


5877 


100 


246 


AL121771 


Homo sapiens 


dJ548G19.1.1 (novel protein 
(ortholog of mouse zinc finger 
protein ZFP64) (translation of cDNA 
NT2RP3001398 (Em:AK001596)) 
(isoform 1)) 


3628 


100 


247 


L25314 


Drosophila 
melanogaster 


actin-related protein 


984 


47 


248 


X63745 


Homo sapiens 


KDEL receptor 


1095 


100 


249 


AF112208


Homo sapiens 


13kDa differentiation-associated 
protein 


816 


100 


250 


AP001707 


Homo sapiens 


human gene for claudin-8, Accession 
No.AJ250711 


1172 


100 


251 


AL136125 


Homo sapiens 


dJ304B14.1 (novel protein)


778 


100 


252 


AL031186 


Homo sapiens 


bK984G1.1 (supported by FGENES)


532 


100 


253 


Y17531 


Homo sapiens 


Human secreted protein clone BL205 
14 protein. 


639 


100 


254 


AL049843 


Homo sapiens 


dJ392M17.3 (KIAA0349 protein) 


6741 


99 


255 


AJ242972 


Homo sapiens 


TOLLIP protein 


1424 


99 


256 


Y94873 


Homo sapiens 


Human protein clone HP02632. 


1876 


100 


257 


AF279865 


Homo sapiens 


kinesin-like protein GAKIN


2903 


100 


258 


AL024498 


Homo sapiens 


dJ417M14.1 (novel protein) 


589 


100 


259


R66278 


Homo sapiens


Therapeutic polypeptide from
glioblastoma cell line.


8:o 


100 


260 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB


3226 


99 


261 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB


2821 


100 


262 


AF101784 


Homo sapiens 


b-TRCP variant E3RS-IkappaB 


3149 


99 


263 


AF197060


Homo sapiens 


src homology 3 domain-containing 
protein HIP-55 


2257 


100 


264 


Y86262 


Homo sapiens 


Human secreted protein HAQAR23, 
SEQ ID NO:177.


766 


100 


265 


Y56966


Homo sapiens 


Human SBPSAPL polypeptide. 


2779 


100 


266 


Y56966 


Homo sapiens 


Human SBPSAPL polypeptide. 


1018 


99 


267 


AJ300465 


Homo sapiens 


putative white family ATP-binding 
cassette transporter 


1557 


95 


268 


AC004030 


Homo sapiens 


F21856 2 


3579 


99 


269 


X55954 


Homo sapiens 


HL23 ribosomal protein 


714 


100 


270 


AB033921 


Mus musculus 


Ndrl related protein Ndr2 


1855 


94 


271 


AF081886 


Homo sapiens 


ERO1-like protein


1905 


99 


272 


AF166492 


Homo sapiens 


small GTPase RAB6B 


1060 


100 


273 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein)


2201 


100 


274 


W88667 


Homo sapiens 


Secreted protein encoded by gene 
134 clone HATBP89. 


1530 


99 


275 


X00129 


Homo sapiens 


precursor RBP 


1044 


97 


276 


Z47500_cdl 


Homo sapiens 


11-MAY-1998 Human RHOH gene
sequence. 


1161 


100 


277 


AB049188 


Equus caballus


ubiquitin C-terminal hydrolase


1118 


96 



278


AF270647 


Homo sapiens 


GTT1 


1564 


100 






Mus musculus


coroDin-x 


"iA 1 A 

2414 


94 




rvOJ iJi 


Homo sapiens


Endothelial cell polypeptide.


911 


92 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031 


100 


282 


D83948 


Rattus 
norvegicus 


Sl-1 protein 


3975 


90 




VI A TAB 


Homo sapiens 


I Kappa B-like protein 


2037 


100 


ZOO 


AT A1111/C 


Homo sapiens 


dJ2oUlU.3(riMJl lol 
(hydroxysteroid (11-beta)
dehydrogenase 1)


294 


100 


287


D64109 


Homo sapiens 


tob family 


1773 


99 




AB026043 


Homo sapiens 


MS4A7 


1230 


100 


289 


M61866 


Homo sapiens 


Krueppel-related DNA-binding 
protein 


209 


90 


290 


AJ001810 


Homo sapiens 


mRNA cleavage factor 1 25 kDa 
subunit 


1217 


100 




y 99434 


Homo sapiens 


Human PRO1605 (UNQ786) amino
aciu sequence oeki lv NU:jy5. 


694 


100 


292 


Y44824 


Homo sapiens 


Human molecule associated with cell
proliferation, MACP-4. 


2370 


100 


293 


AJ276101 


Homo sapiens 


GPRC5B protein 


2099 


100 


294 


AF161406 


Homo sapiens 


HSPC288 


719 


100 


295 


Y58628 


Homo sapiens 


Protein regulating gene expression 
PRGE-21. 


1276 


100 


296 


U91561 


Rattus 
norvegicus 


pyridoxine 5'-phosphate oxidase


1239 


87 


297 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1624 


83 


298 


AF226730 


Homo sapiens 


Cytl9 


1729 


99 


299 


AF226730 


Homo sapiens 


Cytl9 


906 


98 


300 


Y54324 


Homo sapiens 


Amino acid sequence of a human 
gastric cancer antigen protein. 


718 


89 


301 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1606 


100 






Homo sapiens 


Human receptor molecule (REG) 
encoded by Incyte clc .;t- .<x»25826. 


1675 




303 


AP24r565 


Homo sapiens 


hepatocellular carcinoma associated 
ring finger protein 


525 


100




ArzUo©44 


Homo sapiens 


dM-UU2 


428 


100






Homo sapiens 


similar to PID:g3877944


1988 


100 


306 


AL132978 


Arabidopsis 


putative protein 


210 


25 






Homo sapiens 


olfactory receptor 


1645 


100 


308 


AF180681 


Homo sapiens 


guanine nucleotide exchange factor 


3597 


100 




ATI 1 T QC^ 

At lllooo 


Homo sapiens 


sodium dependent phosphate 
transporter isotonn Nari-J d 


3591 


99 




I Ijjoj 


Homo sapiens


G-protein coupled receptor 


2171 


100 


i 

jii 




Homo sapiens


cci40uiu.2 ^mercaptopyruvate • 
sujiunransierase (liu z.o.i./.)) 


1598 


100 


312 


X79535 


Homo sapiens 


beta tubulin 


2348 


100 






Homo sapiens 




Sol 


100 


314 


AF078866 


Homo sapiens 


SURF-4 


1395 


100 


31/ 


Z37980 


Homo sapiens 


phenylalkylamine binding protein 


1258 


100 




AB047892


Macaca
fascicularis


hypothetical protein 


258 


82 


321 


Y25755 


Homo sapiens 


Human secreted protein encoded 
from gene 45. 


1440 


100 


322 


AB016531 


Homo sapiens 


PEX16 


1741 


100 


323 


AL391141 


Arabidopsis 


putative protein


274 


49 






thaliana












Hryrnn conipnc 


i . 

DNA polymerase iota 


3691 


99 


326 


X96698 






1 a cn 


96 


327 


AF152325 


Homo sapiens 


protocadherin gamma A5 


4769 


100 




afi sisn^ 

r\JT ux ov/j 


Homo sapiens


L/Lji-4j protein 


1970 


100 


329 


X74070 


Homo sapiens 


transcription factor BTF3 


639 


81 




A 171*71 1 fY) 
ATI /lllKZ 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


331 


W54040 


Homo sapiens 


Human interferon-inducible protein, 

TJTTJT 


484 


98 


332 


AF024617 


Homo sapiens 


transcription-associated zinc ribbon 
protein 


691 


100 


333 


U19181 


Rattus 
norvegicus 


Rabin3 


2129 


90 




vjUJo// 


Homo sapiens 


Human secreted protein, SEQ ID 


621 


100 






Homo sapiens 


dK1>2jH9.2 (ortbolog of A. thaliana 
F23F1.8) 


626 


100 


336 


AF110774


Homo sapiens 


adrenal gland protein AD-001 


647 


100 


iot 


A Dm tAt A 

AtJU11414 


Homo sapiens 


Kruppel-type zinc finger protein 


1674 


58 


53© 


Arzu/ouO 


Homo sapiens 


ethanolamine kinase 


129 


100 


340 


AC020579 


Arabidopsis
thaliana


putative 

phosphoribosylformylglycinamidine 
synthase; 25509-29950 


3283 


50 


341 


Y28576 


Homo sapiens 


Secreted peptide clone pe503 1. 


944 


100 


342 


U32274 


Saccharomyce 
s cerevisiae 


Ydr386wp; CAI: 0.12 


191 


37 


343 


AO 1771 


synthetic 
construct 


vascular anticoagulating protein 


1661 


99 


344 


AF220052 


Homo sapiens 


uncharacterized hematopoietic 
stem/progenitor cells protein 
MDS032 


1285 


100 


345 


Y70400 


Homo sapiens 


Human cell-signalling protein-2. 


754 


100 


346


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6JL derived protein. 


962 


100 


347 


A 171 OO ^1" 


Homo sapiens


7 Si * ? :Da protein 


1329




54^ v. 


AUOOoUoy 


Arabidopsis
thaliana 


putative cleavage and
polyadenylation specificity factor


13^ 


S5 . 


lAd 


AL032631 


Caenorhabditis 
elegans 


YIC-tG6TL8 


194 


39 




U7U669 


Homo sapiens 


Fas-ligand associated factor 3


167 


23 




i9j4oo 


Homo sapiens 


Amino acid sequence of a potassium 
channel interactor protein. 


1182 


92 


352 


AF005856 


Drosophila 
yakuba 


anon2A5 


111 


45 


353 


AJ271684 


Homo sapiens 


myeloid DAP12-associating lectin


1013 


100 






Homo sapiens 


WD-repeat protein 6


2882 


99 


355 


U51730 


Murine 

leukemia virus 


reverse transcriptase 


316 


42 




LOUol / 


Saccharomyce 
s cerevisiae 


YFL042C 


279 


27 


jj / 


UDUOl t 


Saccharomyce 
s cerevisiae 


YFL042C


279 


27 


J JO 


Ar 1014JZ 


Homo sapiens 




1059 


93 


359 


AB029488 


Homo sapiens 


CllorGl 


758 


99 


360 






puutuve uuoraui Dinaing protein ag 


1239 


100 


361 


U43281 


Saccharomyce 
s cerevisiae 


Lpg22p 


2074 


74 


362 


U43281 


Saccharomyce 
s cerevisiae 


Lpg22p 


2153 


74 


363 


AC0071 


A ranislrincic 
■rVl a U lUUpaiS 

u mi i in 


100632 


1 DO 


24 


364 


AF1Q7Q97 


HTfirnn ciini pnc 

livlUU 2MipiCU3 


AFSnll nrntf»in 

.fY* 11 J X LMUlWill 


3QQ7 


99 


365 


D28500 


Homo sapiens 


mitochondrial isoleucine tRNA 

OJr MX UtwUUv 


4286 


98 


366 


X97868 


Homo sapiens 


arylsulphatase 


3141 


98 


367 


AT 1&70AR 


Homo sapiens


hypothetical protein


1DJ2 


100 


368 




xvi uo niuscuius 


ousruiuugciiic acuic regulatory 

nrt\t(*\n 
|/iU twill 


i on 

lay 


25 


369 


AF113249


Homo sapiens 


multiple domain putative nuclear 


1022 


59 


370 


VfLXJOOO 


duo uiuius 


ciiuuocpiiiv~i cioicu pi uicui precursor 


7/17^ 


84 


371 




XlUiUU SapiCllS 


serine/threonine protein kinase




100 


372 


W74802 


Homo sapiens 


Human secreted protein encoded by 


1532 


89 


373 


AF100779


Homo sapiens


tan o er*iYi .M 1 
LCiiCl^Ciil'lVl 1 


1 1 C2« 

1 iDij 


99 


374 


AF0Q0Q34 


XxvlilU SapiCIlS 




107 


100 


375 


AB021643 


Homo sapiens 


gonadotropin inducible transcription 
reprcssoro 


2761 


99 




/TLDvfjT / JO 


Homo sapiens


ivia vru oinuing protein 


1331 


100


377 


AF070666 


Homo sapiens 


Kruppel-associated box protein 


466 


97 


J to 




Mus sp. 


nuclear pore complex glycoprotein 

**£7 

pox 


464 


60 


379




Mus musculus 


ou^varjj-y nomoiog ouv^ynz 


1690 


88 


380 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor


7851 


99 


381 


AF118566


Mus musculus 


hematopoietic zinc finger protein


1769 


92 




AiwUUo 1 y 


Homo sapiens 


unnamed protein product 


810 


100 


383 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor


7851 


99 


384 


AF117946


Homo sapiens 


Link guanine nucleotide exchange 
factor II 


2363 


100 


JO J 


Ar iZjJyU 


Drosophila
melanogaster 


Lo2G 


139 


41 






Homo sapiens


Human secreted protein clone 
cal06_19x protein sequence SEQ ID


i 


387 


U18795


Saccharomyce
s cerevisiae


Yel064cp 


206 


38 


388 


AF177388 


Homo sapiens 


cancer-amplified transcriptional 
coactivator ASC-2 


10748 


99 


389 


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 7


3469 


96 


390 


AF097366 


Homo sapiens 


cone sodium-calcium potassium 
exchanger 


3166 


100 






Homo sapiens


Down syndrome cell adhesion 
molecule 


5337 


60 




T TRIORS 


nnTVPOrifllQ 
UvJ rCglvUSt 


ankyrin binding cell adhesion
molecule neurofascin


3967 


91 


393 


X65224 


frftHiiQ online 


UCUIUlaovUl 


4097 


78 


394 


X13916 


Unmn qbtiiptic 

liUJUU MXpidlo 


LDL-receptor related precursor (AA
-19 to 4525) 


4292 


99 






rirariA cstnipno 
nuuiu sapicus 




444 


98 


396 


AB017026 


Mus musculus 


oxysterol-binding protein 


2173 


98 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240)


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 
gene 85 clone HSDFV29. 


722 


92 


399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 
(HYDRL-8). 


1637 


99 


400 


AF039718 


Caenorhabditis 
elegans 


contains similarity to lupus LA 
protein homologs 


325 


43 


401 


AE000877 


Methanotherm 
obacter 
thermoautotro 
phicus 


conserved protein 


231 


36 


402 


Y27795 


Homo sapiens 


Human secreted protein encoded by 
gene No. 79. 


1539 


99 


403 


Z50853 


Homo sapiens 


CLPP 


615 


100 


405 


X03475 


Rattus 
norvegicus 


ribosomal protein L35a(aa 1-110) 


576 


99 


406 


AF144237


Homo sapiens 


LOMP protein 


252 


44 


407 


U20239 


Mus musculus


fibrosin 


288 


76 


409 


AL033378 


Homo sapiens 


dJ323M4.1 (KIAA0790 protein) 


6026 


99 


410 


X54326 


Homo sapiens 


glutaminyl-tRNA synthetase


7577 


99 


411 


X61585 


Bos taurus 


polynucleotide adenylyltransferase 


3715 


97 


412 


AF217190 


Homo sapiens 


MLEL1 protein 


5271 


99 


414 


G02815 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6896. 


314 


95 


415 


AJ245922 


Homo sapiens 


alpha-tubulin 8 


2370 


100 


416 


AF203032 


Homo sapiens 


neurofilament protein 


220 


21 


417 


Z97653 


Homo sapiens 


C380A1.2.1 (novel protein (isoform 

D) 


1567 


100 


418 


AJ404326 


Homo sapiens 


SR+89 


1871 


99 


419 


AJ404326 


Homo sapiens 


SR+89 


902 


64 


420 


AF134726 


Homo sapiens 


G9A 


5334 


99 


421 


L28125 


Podospora 
anserina 


beta transducin-like protein 


288 


39 


422 


W21733 


Homo sapiens 


NIP-1 encoded by clone 59. 


110 


72 


423 


S67970 


Homo sapiens 


ZNF75=KRAB zinc finger 


951 


76 


424 


L28035 


Mus musculus


protein kinase C gamma 


3768 


98 


426 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


555 


56 


427 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 

sequence.


266 


49 


428 


X6U18 


Homo sapiens


TTG-2a/RBTN-2a 


876 


100 


429 


Z96932 


Homo sapiens 


nuclear autoantigen of 14 kDa


496 


83 


430 


AJ277291 


Homo sapiens 


HELG protein 


678 


72 


431 


X82157 


Homo sapiens 


hevin 


3525 


99 


432 


AC007192 


Homo sapiens 


P85B_HUMAN; PTDINS-3-
KINASE P85-BETA 


3825 


99 


433 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184) 


1713 


50 


434 


AF084464 


Rattus 
norvegicus 


GTP-binding protein REM2 


141 


29 


435 


AL049795 


Homo sapiens 


dJ622L5.2 (novel protein)


1756 


98 


436 


M14513 


Rattus 
norvegicus 


(Na+ and K+) ATPase, alpha(III)
catalytic subunit 


4269 


99 


437 


U33460 


Homo sapiens 


DNA-directed RNA polymerase I, 
largest subunit 


8777 


98 


438 


D87076 


Homo sapiens 


similar to human bromodomain 
protein BR140(JC2069) 


3067 


100 


439 


L43912 


Macaca 
mulatta 


mannose-binding protein A 


589 


93 




"pvO t HtLI 

UJ 1 / W 


Homo sapiens 


ha0946 protein is Kruppel-related. 


927 


49 


441 


U70976 


Homo sapiens 


arrestin


2068 


99 


442 


B08069 


Homo sapiens 


A human beta-alanine-pyruvate 
aminotransferase (HAPA). 


2343 


99 


443 


AF100662


Caenorhabditis 


contains similarity to ubiquitin 


166 


24 






elegans 


carboxyl-terminal hydrolase (Pfam:
UCH-1.hmm, score: 28.46) (Pfam:
UCH-2.hmm, score: 47.53)






444 


D78017 


Rattus 
norvegicus 


NFI-A1 


2667 


98 


445 


AL049569 


Homo sapiens 


dJ37C10.3 (novel ATPase)


2418 


100 


446


AJ242540 


Volvox carteri 
f. nagariensis 


hydroxyproline-rich glycoprotein

TY7 UDPD 

U6~rus\jr 


165 


34 


AAQ 


A T 1 1 1 1 CO 

AJliJ352 


Homo sapiens 


dLNrdJ / protein 


2006 


100 




AJ 1J3352 


Homo sapiens 


juvirZi i protein 


1025 


96 


451 


AF170708 


Homo sapiens 


T-box protein TBX3 


3700 


99 


452 


AK002080 


Homo sapiens 


unnamed protein product 


1546 


99 


453 


L32911 


Homo sapiens 


Rieske Fe-S protein 


1239 


93 


454 


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


1533 


57 


455 


Y01141 


Homo sapiens 


Secreted protein encoded by gene 7 
clone HTLFA90. 


1453 


99 


456 


AB006631


Homo sapiens 


The human homolog of mouse Cux-2


6559 


100 


457 


AF067165 


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180 


95 


460 


U97002 


Caenorhabditis 
elegans 


similar to acyl-CoA dehydrogenases 
and epoxide hydrolases; Pfam
domain PF00441 (Acyl-CoA_dh),
Score=57.4, E-value=1.7e-16, N=2;
contains similarity to Pfam domain
PF00702 (Hydrolase), Score=57.4,
E-value=1e-13, N=1


583 


37 


461 


AK023114 


Homo sapiens 


unnamed protein product 


1041 


99 


462 


M93134 


Friend murine 
leukemia virus 


pol protein 


289 


44 


463 


AF055473 


Homo sapiens 


GAGE-8 


232 


47 


466 


Y51415 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


467 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 


468


... V57936 


Homo sapiens


Human transmembrane protein
HTMPN-60. 


1629 


96 


469 


D38552 


Homo sapiens


The ha1539 protein is related to
cyclophilin. 


2995 


100 


470 


Y70013 


Homo sapiens 


Human Protease and associated 
protein-7 (PPRG-7). 


3530 


100 


471 


AJ224747 


Homo sapiens 


C-terminal variant of hINADL 
including 2 amino acid exchanges 
and an insertion of 28 amino acids in 
frame. 


7969 


100 


472 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_12 protein. 


1546 


100 


473 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_12 protein. 


998 


98 


474 


X63526 


Homo sapiens 


homologue to elongation factor 1- 
gamma from A.salina 


2273 


99 


475 


X15940 


Homo sapiens 


ribosomal protein L31 (AA 1-125) 


644 


100 


476 


M60832 


Homo sapiens 


alpha-2 type VIII collagen


3581 


99 


477


AF039697 


Homo sapiens 


antigen NY-CO-31 


1213 


97 


478 


AF156929 


Sus scrofa 


inflammatory response protein 6 


1588 


83 


479


Ar2o4717 


Homo sapiens 


FYVE domain-containing dual
specificity protein phosphatase 
FYVE-DSP2 


5610 


99 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POL4P 


2478 


94 


481 


X89750 


Homo sapiens 


TGCP protein 


1413 


100 


482 


M93107 


Homo sapiens 


(R)-3-hydroxybutyrate
dehydrogenase 


1663 


96 


483 


U58334 


Homo sapiens 


Bbp/53BP2 


1556 


41 


484 


AF151538 


Homo sapiens 


deoxycytidyl transferase; Rev1p


4281 


99 


485 


Z98884 


Homo sapiens 


dJ467L1.1 (KIAA0833)


699 


73 


486 


AJ243874 


Homo sapiens 


oligophrenin-4 


3682 


100 


487 


Z11737 


Homo sapiens 


flavin-containing monooxygenase 4


2969 


100 


488 


X56123 


Mus musculus


talin 


4353 


77 


489 


AJ278112 


Homo sapiens 


putative cell cycle control protein 


335 


23 


490 


W74843 


Homo sapiens 


Human secreted protein encoded by 
gene 115 clone HOVBA03, 


1013 


98 


491 


Y41337 


Homo sapiens 


Human secreted protein encoded by 
gene 30 clone HRDDV47* 


509 


36 


492 


X90530 


Homo sapiens 


ragB 


1926 


99 


493 


X90530 


Homo sapiens 


ragB 


1405 


99 


494 


X90530 


Homo sapiens 


ragB 


1893 


96 


495 


AL022394 


Homo sapiens 


dJ511B24.3 (KIAA0395 (probable 
homeobox protein)) 


4990 


99 


496 


Y11395 


Homo sapiens 


lanthionine synthetase C-like protein 
1 


2168 


100 


497 


AJ010119 


Homo sapiens 


Ribosomal protein kinase B (RSK-B) 


4001 


100 


498 


G01563 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5644. 


330 


100 


499 


X54131 


Homo sapiens 


protein-tyrosine phosphatase


10465 


99 


500 


G01082 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5163. 


549 


100 


501 


AC004142 


Homo sapiens 


similar to murine leucine-rich repeat 
protein; possible role in neural 
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:gl369906) 


3676 


100 


502 


AL117544


Homo sapiens 


hypothetical protein 


1226 


100 


503 


AF203032 


Homo sapiens 


neurofilament protein 


5115 


99 


504 


AL034417 


Homo sapiens 


WC215D1 1.2 (similar to rat gene 33) 


2476 


100 


505


X69090


Homo sapiens


190:^:.:v:.li


7543


99


506


U56755


Caenorhabditis 
elegans 


coded for by C. elegans cDNA
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for
by C. elegans cDNA yk46d5.5;
coded for by C. elegans cDNA
yk43c2.5; coded for by C. elegans
cDNA yk46e8.3; coded for by C.
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
yk13h10.3; coded for by C. elegans
cDNA yk34b1.3


782 


55 


507 


AJ293309 


Homo sapiens 


NHP2 protein 


801 


100 


508 


U39045 


Rattus 
norvegicus 


cytoplasmic dynein intermediate 
chain 2B 


3241 


97 


509 


AF063231 


Mus musculus 


cytoplasmic dynein intermediate 
chain 2 


3159 


97 


510 


AF202893 


Mus musculus 


KiClb 


4336 


95 


511 


Y13115 


Homo sapiens 


serine/threonine protein kinase


5071 


99 


512 


AB030207 


Homo sapiens 


G gamma subunit


364 


100 


513 


AF039571 


Homo sapiens 


peripheral benzodiazepine receptor 
interacting protein; PBR-IP/PRAX1 


495 


33 


514 


AB037883 


Homo sapiens 


Gb3/CD77 synthase 


1916 


99 


Jij 


UtvoOo 


Escherichia
coli


similar to 


1489 


100 






Homo sapiens


zinc finger protein Hsal2 


3290 


100 


517


ATU330DO 


Mus musculus


apoptosis-linked gene 4, deltaC form 


2904 


78 


518 


/vruiyyzo 


Mus musculus


protein Kinase 


1094 


90 


519 


M34513 


Homo sapiens 


omega protein 


317 


91 






Homo sapiens 


88kDa nuclear pore complex protein 


2313 


99 


3^61 


iUoolz 


Homo sapiens 


88kDa nuclear pore complex protein 


1561 


99 


3ZZ 


AT AO<T« 


Homo sapiens 


J a com O 1 /VTA A f\nC*l 

oA3>JriJo.i (KJAAO/o/ protein) 


2497 


100 




A 171 fliCTylO 
Ar 1 oOZ4y 


Homo sapiens 


six transmembrane epithelial antigen 
of prostate 


1790 


100 


524 


AB029012 


Homo sapiens 


KIAA1089 protein


4933 


100 


3Z3 


AoUzooyj 


Homo sapiens 


vascular cadherin-2 


5962 


100 


526 


X74331 


Homo sapiens 


DNA primase (p58 subunit) 


1720 


100 


528 


AC007228 


Homo sapiens 


R31665 2 


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit
preprotein 


2639 


100 


530 


U80446 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA 
yk172e6.3; coded for by C. elegans
cDNA yk158f7.3; coded for by C.
elegans cDNA yk158f7.5; coded for
by C. elegans cDNA yk172e6.5


420 


39


531 


S76838 


Mus sp.


Dbs 


4821 


88 


532 


Z82215 


Homo sapiens 


dJ6802.2 (myosin, heavy 
polypeptide 9, non-muscle) 


9828 


100 


533 


AF245505 


Homo sapiens 


adlican


277 


31 


534 


AF300612 


Homo sapiens 


N-acetylgalactosamine-4-O-
sulfotransferase


993 


59 


535 


AL121928


Homo sapiens 


bA18I14.3 (pleckstrin and Sec7
domain protein) 


3333 


99 


J JO 


AJx/IU33 


Mus musculus


iroquois homeobox protein 6 


1724 


76 


33/ 


ArloU4 /J 


Homo sapiens 


Not2p 


2267 


100 


538 


AF071059 


Mus musculus


zinc finger RNA binding protein 


1089 


51




AtUZj433 


Homo sapiens 


actin-related protein 3-beta


2219 


100 


540 


AC003030 


Homo sapiens 


Rivm i 


1^0! 


70 


541


AC003030 


Homo sapiens


R29828 1 


2294 


100 


542 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein
(continues in AL023803))


2152 


100 


543 


AB006135 


Rattus 
norvegicus 


db83 


1238 


98 


544


UUZO30 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6731. 


644 


97 


545


YU/393 


Homo sapiens 


transcription factor TFIIH


2373 


100 


546


AL. 1 33343 


Homo sapiens 


bA386N14.1 (novel protein similar
to a dual specificity phosphatase) 


964 


99 


547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA
synthase 


2647 


100 


CvlQ 


AT 1 34 /ZD 


Homo sapiens 


JNU3/ 


4359 


99 


345* 


AJ3U33330 


Homo sapiens 


neurexin I-alpha protein 


6948 


99 




ADU3 /yUl 


Homo sapiens 


gene amplified in squamous cell 
carcinoma- 1 


5215 


99 


33Z 


AUU43034 


Homo sapiens 


rAK-oA 


885 


100 


333 


ArwUoy3 


Homo sapiens 


partial CDS 


4875 


99 


554 


AF002223 


Homo sapiens


mvntnhnlnrin rplntAH 1 




100


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIAA0093);
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 


558 


X65873 


Homo sapiens 


kinesin heavy chain 


4860 


100 


559 


AJ277365 


Homo sapiens 


polyglutamine-containing protein


592 


36 


560 


AF205600 


Homo sapiens 


transposase-like protein


407 


27 


561 


X71125


Homo sapiens 


glutaminyl-peptide cyclotransferase 


1914 


100


562 


X71125 


Homo sapiens 


glutaminyl-peptide cyclotransferase


1456 


97 


563 


X54304 


Homo sapiens 


myosin regulatory light chain 


897 


100 


564 


AF250842 


Drosophila 
melanogaster


multiple asters 


130 


23 


565 


Y58608 


Homo sapiens 


Protein regulating gene expression 
PRGE-1.


1619 


99 


566 


AL121893 


Homo sapiens 


bA189K2L5 (novel protein similar 
to retinoblastoma binding protein 
(RBBP9)) 


1012 


100 


567 


AL117352


Homo sapiens 


dJ876B10.2 (novel protein (ortholog 
of rat EX084))


3713 


99 


568 


AF228603 


Homo sapiens 


pleckstrin 2 


1841 


100 


569 


AF239243 


Homo sapiens 


histone deacetylase 7 


3244 


86 


570 


AF087695 


Mus musculus 


veli3 


989 


100 


571 


AB046381 


Homo sapiens 


testis-abundant finger protein


1346 


99 


572 


AC005551 


Homo sapiens 


R26529_2, partial CDS 


1020 


100 


573 


Y90290 


Homo sapiens 


Human peptidase, HPEP-7 protein 
sequence. 


274 


52 


574 


W76734 


Homo sapiens 


Human mDia Rho targeting protein. 


712 


32 


575 


AL121935 


Homo sapiens 


bA517H2.3 (t-complex 10 (a murine 
tcp.homolog)) 


853 


78 


576 


Y86217 


Homo sapiens 


Human secreted protein HWHGU54, 
SEQ ID NO: 132. 


2123 


99 


577 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein)


6329 


99 


578 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein)


6329 


99 


579 


X92715 


Homo sapiens 


KRAB/C2H2 zinc finger protein


3102 


97 


580 


X54637 


Homo sapiens 


protein tyrosine kinase 


5564 


98 


581 


X78817 


Homo sapiens 


p115


1148 


44 


582 


AJ251245 


Rattus 
norvegicus 


SECIS binding protein 2 


3086 


71 


583


AF113125


Homo sapiens


E-1 enzyme


5S!


100


584


Mi?529


Sus scrofa


follistatin A


lSKo


98


585 


AF169677 


Homo sapiens 


leucine-rich repeat transmembrane 
protein FLRT3 


3403


100


586 


D87685 


Homo sapiens 


similar to human transcription factor 
TFHS (S34159). 


8083 


99 


587 


Y00876 


Homo sapiens 


Human LAPH-1 protein sequence. 


2110 


100 


588 


Y99674 


Homo sapiens 


Human GTPase associated protein- 
25. 


2111 


99 


589 


D86973 


Homo sapiens 


similar to Yeast translation activator 
GCN1 <P1:A48126) 


12033 


99 


590 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 


1979 


100 


591 


Y57396 


Homo sapiens 


Human lysoenzyme LYC4 
polypeptide. 


814 


100 


592 


AJ297743 


Mus musculus 


torsinB protein 


1448 


85 


593 


AF164796 


Homo sapiens 


NADH:ubiquinone oxidoreductase
MLRQ subunit homolog 


469 


100 


594 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


749 


94 


595 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 


Y77123 


Homo sapiens 


Human neurotransmission-associated 
protein (NTAP) 998868. 


2102 


98 


597 


AF215703 


Drosophila
melanogaster


KISMET-L long isoform


1880


65


598 


AF070447 


Homo sapiens 


barrier-to-autointegration factor 


290 


90 


599




Plasmodium
falciparum


liver stage antigen 


372 


22 




Y7QR7R 


Mus musculus


JNJvlU 


202 


53 


601 


AB004109 


Cricetulus
griseus


phosphatidylserine synthase II 


2262 


92 


602 


U94988 


Mus musculus 


Nutpl 


2912 


89 






Mus musculus 


Nulpl 


2800 


86 


604


AF006264


Homo sapiens


recombination and sister chromatid
cohesion protein homolog


2850 


100 


605 


AF006264 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2530 


100 


606


AoZxOU 


Homo sapiens 


RanGAP1


2929 


100 


607




Homo sapiens 


RanGAP1


1843 


97 


608 


AF160909 


Drosophila 
melanogaster 


BcDNA.LD03471 


943 


58 


610 


X74801 


Homo sapiens 


gamma subunit of CCT chaperonin 


2745 


99 


611


ALU3 1 4/7 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100 


612 


Y71072


Homo sapiens 


Human membrane transport protein, 
MTRP-17. 


445 


100 


613 


X16396 


Homo sapiens 


precursor polypeptide (AA -29 to
315)


1749 


100 


614


A&UUvzol 


Homo sapiens 


unnamed protein product 


1814 


99 


615


A DAI 1 1 0O 

Am) I J 1Z6 


Homo sapiens 


KIAA0556 protein 


5761 


99 


616


U19361 


Petromyzon 
marinus 


NF-180 


205 


21 


617


A l?A>|CCCC 

AtU4jjjj 


Homo sapiens 


wbscrl 


1208 


100 


618




Homo sapiens 


wbscrl alternative spliced product 


1318 


100 


619




Felis catus


ribosomal protein L41


128 


100 


620


V 1 *7 1 /CO 

Y l /loy 


Homo sapiens 


A6 related protein 


1819 


100 


621


X 1ZU0D 


Homo sapiens 


hNop56


2956 


99 


622


Ar J ///Do 


Homo sapiens 


ubiquitin specific protease 16 


2998 


100 


623


Ar J 1 


Homo sapiens 


UAOl 


3866 


100 


624 


ALQTG297 


Homo sapiens


hypothetical - 


1227 


99


625




Homo sapiens 


BC273239 l 


3398 


99 


626 


Z68747 


Homo sapiens 


imogen 38 


2024 


99 


627 


Z68747 


Homo sapiens 


imogen 38 


1958 


97 


628 


Y70229 


Homo sapiens 


Human RNA-associated protein-10
(RNAAP-10).


3424 


99 


629 


AF191492 


Homo sapiens 


nasopharyngeal carcinoma associated 
gene protein-8 


613 


100 


630 


AF119664


Homo sapiens 


transcriptional regulator protein 
HCNGP 


1574 


100 


631


AF119664


Homo sapiens


transcriptional regulator protein
HCNGP


1150 


89 


632


i i /©4y 


Homo sapiens 


ganglioside-induced differentiation
associated protein 1


1839 


98 


633 


X55740 


Homo sapiens 


5'-nucleotidase


3012 


100 


634


ArUJyOoo 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635 


AF119662


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic-5 


2544 


100 


637


AF077818 


Mus musculus


syntrophin-associated serine-
threonine protein kinase


2027 


44 


638 


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle-
associated membrane protein)- 
associated protein B and C) 


150 


26 


639 


AF078844 


Homo sapiens 


hqp0376 protein 


416 


81




I DJH77 


JCSCUcXlCQia 

\AJ1A 


\jj\r ixjy, was \jj\r nyi ana 




100 


641 


AK024442 


Homo sapiens 


FLJ00032 protein 


1677 


56 


642




Homo sapiens


noosuniai pruicin oxo 


J4U 


100 






AnllUS IaUUS 


nooooniai proiem oz 


1 CIA 


98 


644 


AB002348 


Homo sapiens


KIAA0350 protein 


5186 


99 


646 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding 
protein, izriDo. 


1178 


98 


647




Mus musculus


JNK-binding protein JNKBP1


4609 


81 


648




Arabidopsis
thaliana


contains similarity to isoamyl
acetate-hydrolyzing
esterase-gene_i±MQB225


407 


44 






Homo sapiens 


Unknown gene product 


858 


99 


651




Homo sapiens 


diabetes mellitus type I autoantigen


253 


66 




AOUljJ 


Homo sapiens 


zinc finger 41


4349 


100 


653 


X53330 


Platynereis 
dumerilii 


H4 protein (AA 1-103) 


523 


100 


654 


AC003682 


Homo sapiens 


R27945_2


2558 


100 


655


A80473 


Mus musculus 


rab19


596 


56 






Rattus
norvegicus


unknown protein 


201 


95 


Ml 


J\\Aj\)0\) 1** 


Homo sapiens 


similar to RFP transforming protein;
smuiar to r\<*oi5 (rtu:giJzoi t) 


1331 


99 


658




Homo sapiens


protein phosphatase 6 


1666 


100 


659




Homo sapiens 


zinc finger protein 


2803 


99 


660




Homo sapiens 


rloj4/ l 


3184 


96 


661 


X79204 


Homo sapiens 


ataxin-1 


4195 


99 


662 


X17620 


Homo sapiens 


Nm23 protein 


965 


99 


663 


AB015617 


Homo sapiens 


ELKS 


1501 


80 


664 


Z56281 


Homo sapiens 


interferon regulatory factor 3 


2331 


100 


665 


AJ248283 


Pyrococcus 
abyssi 


LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)

//"IT ~**V* "/'AT K QT~ T\ 


254 


40 


666




Homo sapiens


U5 snRNP-specific 200kD protein


8819


99


667




Homo sapiens 


U5 snRNP-specific 200kD protein


8589 


97 


668 


AF153450 


Manduca sexta


juvenile hormone esterase binding
protein


225 


32 


669


Ainorioft 
Arzz/iy© 


Homo sapiens 




7231 


99 


670 


X99586 


Homo sapiens 


SMT3C protein


441 


87 


671 


Z61589_cdl 


Homo sapiens 


17-AUG-1998 DNA encoding a 
human OC-2 protein. 


2593 


100 


672 


AJ132702 


Mus musculus 


ATFa-associated factor 


3240 


88 


673




Homo sapiens 


potassium large conductance 
calcium-activated channel beta 3a 
subunit 


1486 


100 






Homo sapiens


Human secreted protein, SEQ ID
NO: 6142. 


558 


99 


675 


G01246 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5327.


141 


77 


676 


AB016839 


Homo sapiens 


mobl 


419 


42 


677


X->8o970 


Homo sapiens 


similar to myosin heavy chain:
Containing ATP/GTP-binding site
motif A (P-loop)


161 


28 


678 


U83115 


Homo sapiens 


non-lens beta gamma-crystallin like
protein 


8569 


99 


679 


AF203687 


Homo sapiens 


prolactin regulatory element-binding 
protein 


2181 


100


680 


M27685 


Mus musculus 


ultra-high sulphur keratin 


650 


58 


681 


U04968 


Cricetulus
griseus


nucleotide excision repair protein


j/ 1Z 


97 


682 


AF119663


Homo sapiens


G-protein gamma-12 subunit


JJO 


in a 


683 


G03733 


Homo sapiens


Human secreted protein, SEQ ID
NO: 7814.




1 AA 


684 


X67699 


Homo sapiens 


CDw52 antigen


907 


1 AA 


685


AF022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme I




1UU 


686 


AJ001006 


Mus musculus 


EMeg32 protein 


938 


96 


687 


W03516 


Homo sapiens


Prostaglandin T"jP rpppntnr 


lou*f 


1 AA 
1UU 


688 


AF019661 


Mus musculus


*A*UX pi Ut-GoavlUv wxalxly i OlYLrVJ 




1 AA 


689 


AF156557 


Homo sapiens 


stomatin related protein 


2036 


100 


690 


G03960 


Homo sapiens


tin TT! an CM*r^tf»fl nrntftin QT?f"\ TTPk 
XxUllloil aCvlOlCU pi UlCLIi, OCU 1 1 J 

NO- RfMI 




100 


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692 


AL031115 


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked 

piwcuij 


4298 


100 


693 


L40410 


Homo sapiens 


thyroid receptor interactor 


806 


100 


694 




Homo sapiens


OXYSTEROL-BINDING
PROTEIN-like; similar to P22059


2533 


99 


695 


AF160411


XlUX V CglA/USf 




4144 


52 


696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH- 
4. 


2144 


100 






Homo sapiens


dopamine responsive protein DRG-1 


1613 


100 


698 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


699


AT 1 ^^fW* 


Unknown 


/prediction=(method:"genscan",
version:"1.0", score:"109.13");
/prediction=(method:


825 


48 


700 


Y96870 


Homo sapiens 


Human goose-type lysozyme 


1032 


100 


701 






Gene with similarity to rat kidney-
specific (KS) gene


1190 


100 


702






Gene with similarity to rat kidney-
specific (KS) gene


937


95


703 


A&42832 


Homo sapiens


WXUKUXI 


3756


100


704 


S52624 


Homo sapiens


iinlfftAum 

"'""""Til 


185 


m 


705


AF005081 


Homo sapiens


SXklXI SpvWilll' piviciii 


652 




706 


Y16793 


Homo sapiens


Ireratin t\mp I 

HVl OXULy Ijr pw X 


2232 


100 


707 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


455 


69 


708 


AF113220


XJIUU1U SMXL/lCIld 


lVli3 X r WrU 


686 


100 


709 


Y44985 


Homo sapiens


Human epidermal protein-2.


408 


65 


710 


Y16132 


Homo sapiens


CDT6 


1874 


100


711 


Y68775 


Homo sapiens


Amino acid sequence of a human
phosphorylation effector


2407 


100 


712 


X63422 


Homo sapiens 


H(+)-transporting ATP synthase


209 


100 


713 


AF169968


Mus musculus 


DNA bind in p" nrntpin FiPSIPT * 

X-'l^J* vlilUXUg pii/tciu X>/X^tJXVX 


1467 


79 


714 


X52563 


Bos taurus 


nermahilitv inrrftflQfnfrmvit^in 

P« liMwmtjf lxxvi ^<iai i ik pi «JLC11 1 


383 


29 


715 


AJ277739 


Homo sapiens 


RPB11b1alpha protein


480 


98 


716 


AL135791 


Homo sapiens 


i/niVfcUiuj i £tiliw uil^vl prtjLcin i 


401 


98 


717 


AF223466 


Homo sapiens 


HT015 protein 


1311 


97 


719 


AF117383


Homo sapiens


piovCilutl prULCin i J 3 ii 1 J 


746 


100 


720 


Z98743 


Homo sapiens 


dJ181C9^ (Rho GTPase activating 
protein 8 (RhoGAP, p50RhoGAP)) 


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5517.


418


96


723 


AF282919 


Mus musculus


Zfp228 


349 


49 


724 


AB023191 


Homo sapiens 


KIAA0974 protein 


2953 


100 


725 


AL031778 


Homo sapiens 


dJ34B21.1 (novel BZRP
(benzodiazepine receptor (peripheral) 
(MBR, PBR, PBKS, IBP, 
Isoquinoline-binding protein)) LIKE 
protein) 


920 


100 


726 


AL021939 


Homo sapiens 


dJ352A20.2 (aldehyde
dehydrogenase family protein) 


1764 


100 


727 


AF182426 


Rattus 
norvegicus 


arylacetamide deacetylase 


791 


42 


728 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331 


99 


729 


AF155135 


Homo sapiens 


novel retinal pigment epithelial cell 
protein 


1652 


99 


730 


AL078606 


Arabidopsis 
thaliana 


putative protein 


277 


55 


731 


Y73352 


Homo sapiens 


HTRM clone 1732368 protein 
sequence. 


1720 


100 


732 


AF178432 


Homo sapiens 


SH3 protein 


3302 


100 


733 


Y17832


Human 
endogenous 
retrovirus K 


env protein 


223 


34 


734 


Y28859 


Homo sapiens 


Human mesoderm induction early 
response protein ER1. 


2067 


98 


735 


U09355 


Oryctolagus
cuniculus


protein phosphatase 2A1 B gamma 
subunit 


2352 


99 


736 


Y94922 


Homo sapiens 


Human secreted protein clone pv6_l 
protein sequence SEQ ID NO:50. 


724 


99 


737 


AB027003 


Mus musculus


protein phosphatase 


378 


84 


738 


AF112200


Homo sapiens 


NADH-oxidoreductase B18 subunit 


739 


100 


739 


AF112200


Homo sapiens 


NADH-oxidoreductase B18 subunit 


613 


88 


740


AF3U2154 


Homo sapiens 


SPG protein 


6556 


100 


741


B256F1 


Homo sapiens


Human secreted protein
encoded by gene 17 SEQ ID NO:70.


1410


99




t mno 


Homo sapiens 


X123 


1237 


99 


743 


L27479 


Homo sapiens 


X123 


1206 


97 


744


Y66745


Homo sapiens 


Membrane-bound protein PRO I i -Jo. 


588 


99 


745


AJUUIUI9 


Homo sapiens


ring finger protein 


1292 


99 


746


A65453 


Sus scrofa 


tubulin-tyrosine ligase 


1882 


94 


747


Y57897


Homo sapiens 


Human transmembrane protein 
HTMPN-21. 


1173 


100 


748 


AF151069 


Homo sapiens 


HSPC235 


1694 


96 


749


Ar JoZ4U4 


Homo sapiens 


mitochondrial uncoupling protein 1 


1674 


100 


750 


AL121993 


Homo sapiens 


dJ776P7.1 (Novel protein) 


2500 


99 


751 


AF149825


Homo sapiens 


PACSIN3 


2253 


100


752


AL008o35 


Homo sapiens 


dJ510H16.2 (high-mobility group
protein 2-like 1)


3026 


99 


753


Y57914 


Homo sapiens 


Human transmembrane protein 
HTMPN-38. 


1124 


100 


754


AF285109 


Homo sapiens 


septin 3 isoform B 


1766 


100 


755


AF004161 


Oryctolagus
cuniculus


peroxisomal Ca-dependent solute 
carrier 


2371 


95 


756 


219585 


Homo sapiens


thf fim Knennn /tin ^ 




IUU 


757 


AP001745 


Homo sapiens 


similar to zinc finger 5 protein 


1857 


100 


758 


AF190664 


Mus musculus 


LMBR2 


555 


72 


759 


AF090326 


Mus musculus


AE-1 binding protein AEBP2 


1540 


97 


760 


AL096677 


Homo sapiens 


dJ322G13.3 (novel protein similar to
bovine and mouse beta-soluble NSF
attachment protein (SNAP-beta))


999


94


761 


AC003007 


Homo sapiens 


Unknown gene product (partial) 


649 


96 


762


TTJJJC177 

UooJ /Z 


Bos taurus 


nDosomai protein ozy 


230 


73 


764 


Y90899 


Homo sapiens 


Dl-like dopamine receptor activity 
modifying protein SEQ ID NO: 1 . 


1152 


100 


765


U88169


Caenorhabditis 
elegans 


similar to molybdoterin biosynthesis 
mxjmZd proteins 


1204 


65 


766


AT 1 1 QCA£ 


Homo sapiens 


aj5yiuzu.3.i (novel DnaJ domain 
protein, similar to mouse and bovine 
cysteine string protein)


1091 


100 


767 


AK024693 


Homo sapiens 


unnamed protein product 


3767 


100 


768


71 1 CI B 

Z.I1M0 


Homo sapiens 


histidyl-tRNA synthetase


2582 


100 


769


A139IO 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


25529 


100 


770




Arabidopsis 
thaliana


uontams 5 rr|irU4UU WU40, Ci-beta 
repeat domains. 


333 


33 


771


AJbJUi /Oo 3 


Mus musculus 


T A "Kill I Tt — * ««.j«A.«J«. 

LAlNr-liKe protein 


1246 


91 


772 


AL161578


Arabidopsis 
thaliana 


putative protein 


335 


46 


773 


AL161578


Arabidopsis
thaliana


putative protein 


333 


47 


774 


AY008271 


Homo sapiens 


helicase SMARCAD1 


5264 


99 


775 


Y21591 


Homo sapiens 


Human secreted protein (clone 
CC332-33). 


1127 


96 


776 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100


777


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


778


W88853


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


779 


A 17 1 t\C >(01 

Ar lyo4ol 


Homo sapiens 


RING finger protein; FXY2


3644 


100 


780


ALU J 342/ 


Homo sapiens 


dJ7o9N13.1 (KIAA0443 protein.) 


1609 


54 


781 


AB026187 


Homo sapiens 


protocadherin-Xa 


5244 


100 


782


£>24458


Homo sapiens


Human secreted protein sequence
encoded by gene 22 SEQ ID NO: 83.


1002


100


783


A DAT71M 

AoUZ /2oV 


Homo sapiens 


cyclin-E binding protein 1 


5421 


100 




Uvzyio 


Homo sapiens


Human secreted protein, SEQ ID 
NO: 6997. 


627 


100 


/JO 


AJ243 ozz 


Homo sapiens 


type I transmembrane receptor 


4560


100 


786


AJZ4D5ZU 


Homo sapiens 


type I transmembrane receptor 


4624 


100 


787




Homo sapiens 


OxU-ancnorea protem pi 


3340 


99 


788


AT A'l^O') 
ALA) j I /5Z 


Homo sapiens


i4!7nSl?< 1 /T>T FTP A TTV/TJ «Aiml 

OJ/UorD.i (I'UlAllVii novel 
urOUagen aipna i jljjsjq protem j 


z/jy 


1 AA 




A Ti a 174c 


Homo sapiens


occz*to protein 


/^A7 
OOUZ 


1 AA 


7Qfl 

/ 7U 


Api A79A-5 

/vr iuf ^uo 


Homo sapiens


ainXTn z-oiij uing protem 


7AAC 
ZUvo 


1 AA 


791




Homo sapiens


procollagen aipna £\y ) 


OUU 


1A 
J4 


792 


AL031055 


Homo sapiens 


dJ28H20^ (novel protein) 


1267 


100 


70^ 


1 joiy*r 


797 

to f 


Human secreted protein 


ZU^l 


yy 


794


A RAO 01 77 
/VDUZolZ/ 


Homo sapiens


mannosyltransferase


Zi JO 


96 


795


A/"7MV777ff. 
Av*UU /ZZo 


Homo sapiens


KJ100D z 


Z/J(S 


79 


796 


AL049482 


Arabidopsis 
thaliana 


putative protein 


436 


47 


797 


AC004528 


Homo sapiens 


R32184_3


891 


91 


798


ARfl17R^A 
/VDvj / 03 v 


Homo sapiens


VT A A 1 A A A -nwxtam 

jviAAi4uy protem 


7<^7 


1 AA 
10U 


799 


X53793 


Homo sapiens 


5' half of the product is homologous
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the 
catalytic subunit of AIR carboxylase 


2232 


100


800


Y99350 


Homo sapiens 


Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33.


1343 


100 


801


AoU42o30 


Homo sapiens 


junctophilin type3 


1225 


47 


802


Ai>U r zy324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


3916 


90 




A±iUz9324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


4961 


90 


804


AF251040 


Homo sapiens 


putative nuclear protein 


2119 


100


805 


AB033281 


Homo sapiens 


F-box and WD-repeats protein beta-
TRCP2 isoform C


2879 


100 


806 


U87305 


Rattus 
norvegicus 


transmembrane receptor UNC5H1 


3257 


90 


807 


AF118889


Rattus 
norvegicus 


b-tomosyn isoform 


3155 


97 


808 


AF226993 


Rattus 
norvegicus 


selective LIM binding factor 


8793 


95 


809 


W19919 


Homo sapiens 


Human Ksr-1 (kinase suppressor of 
Ras). 


3939 


99 


810


AL031782 


Homo sapiens 


djvuwo.i (PUTATIVE novel
Collagen alpha 1 LIKE protein) 


1546 


100 


811


AU002542 


Homo sapiens 


similar to C. elegans F11A10.5; 80%
similarity to Z68297 (PID:g1130619)


2294 


100 


812


Uoiz4o 


Homo sapiens 


copine I 


606 


52 


813 


AF242552 


Gallus gallus


retinovin 


945 


34 


814 


X52332 


Homo sapiens 


zinc finger protein 10 


1651 


93 


815


X52332 


Homo sapiens 


zinc finger protein 10 


2423 


99 


816 


Y09631 


Homo sapiens 


PIBF1 protein 


2935 


99 


817 


X71997 


Rattus 
norvegicus 


myosin I 


3883 


98 


818 


AY004877 


Mus musculus


cytoplasmic dynein heavy chain


11105 


98 


819 


Y27196 


Homo sapiens 


Human cyclic nucleotide 
phosphodiester PDE8B(E) amino 
acid sequence. 


3790 


100 


820 


AF081947 


Mus musculus


tektin 


1134 


81 


821 


AL035106


Homo sapiens


dJ99?C: teontinurs k 
Em:;rj>*45i;^ as bA269H4.1) 


871 


100 


822 


AF022795 


Homo sapiens 


TGF beta receptor associated protein- 
1 


385 


24 




Ar 01 5770 


Mus musculus


radical fringe 


1422 


82 






Homo sapiens 


expressed-Xq28 STS protein


1444 


99 




A77371 


Mesocricetus 
auratus 


COR1 


641 


78 


826 


AB014576 


Homo sapiens 


KIAA0676 protein 


296 


79 


827


AHM-y/ii 


Homo sapiens 


dJ875H3.1 (APK1 antigen) 


1584 


72 


828




Homo sapiens 


disrupted in Schizophrenia 1 protein 


4418 


100 


829 


Z31560 


Homo sapiens 


sox-2 


1683 


100 


830


AF295773 


Homo sapiens 


ral guanine nucleotide dissociation 
stimulator 


4717 


99 




A 'DA/1 1 ftO£ 


Homo sapiens 


GCK family kinase MINK-2 


6866 


100 




T (\AQAQ 


Saccharomyces
cerevisiae


mitochondrial transporter protein 


338


35 


833


AJUV/ULZ 


Mus musculus 


Fish protein 


704 


94 






Homo sapiens 


nucleolar phosphoprotein p130


3455 


99 


835 


U10991 


Homo sapiens 


G2 


8436 


98 


836 










99 


837 


X58288 


Homo sapiens 


protein-tyrosine phosphatase 


7734 


99 


838 


X56958 


Homo sapiens 


ankyrin (brank-2) 


9631 


100 


839 


AC024791 


Caenorhabditis 
elegans 


contains similarity to beta-lactamases 


370 


24


840


D83197 


Homo sapiens 


ankynn repeat protein 


802 


99 


841


Arvjjfl I 


Serinus
canaria


neurofilament medium subunit 


192 


31 


842


ATAOj / /■£ 


Homo sapiens


similar to Homo sapiens ribosomal
protein L10 encoded by GenBank 
Accession Number L25899 


990 


96 


843


TT7/WA"* 


Homo sapiens 


vjatja transport protein 


2992 


98 


844


VllfLAK 


Homo sapiens 


uroplakin II 


897 


100 




1/Z1U04 


Homo sapiens 


similar to rat general mitochondrial
matrix processing protease mRNA
(RATMPP)


2710 


99 


846 


AF199S99 


Homo sapiens


in lemaiin-ricK protein, !NrL,j 


7047 


100 


847 


AF1 99^99 


Homo sapiens


in iemaiin-r ick protein; NrLo 


5472 


100 


848 


AUVTrOy 


Homo sapiens


elongation factor-1-beta


1162 


100 


849 


AC007204 


Homo sapiens 


BC273239_1


2277 


67 


850


nvUUJvOx 


Homo sapiens


"DORBQft 1 
i\ZooOv 1 


2401 


100 


851


AT 191 Sft"3 


Homo sapiens 


OAJDoNz.i (novel protein) 


353 


61 


852 


Z48475 


Homo sapiens 


glucokinase regulator 


3155 


99 


853




Homo sapiens 


dJ3/ri lo.2 (SH3-domain binding
protein 1) 


1884 


98 


854 


AF233323 


Homo sapiens 


Fas-associated phosphatase- 1 


390 


36 


855 


AF062741 


Rattus 
norvegicus 


pyruvate dehydrogenase phosphatase 
isoenzyme 2 


447 


80


856


VI 1A 1 1 

X i 1411 


Homo sapiens 


pristanoyl-CoA oxidase 


3595 


98 


857


My/ lOO 


Strongylocentrotus
purpuratus


tektin A1


290 


46 


858


AoUUl IUj 


Homo sapiens 


hippocalcin-like protein 4 


995 


100 


859


at io*»/yi 


Homo sapiens 


putative 38.3kDa protein 


1795 


100 


860


A 170 OR 1 1*7 

Ar Z70 11/ 


Homo sapiens 


homeobox protein OTX2 


1477 


93 


861




Rattus
norvegicus


golgi peripheral membrane protein 
p65 


1820 


81 


862 


X16901 


Homo sapiens 


30kD subunit of RAP30/74


1284 


100 


863


M1Z14U 


Homo sapiens 


envelope protein 


202 


81 


864


AP? 51459 


Homo sapiens


HSPG109 


315 


98 




AL« 11/7753 


Homo sapiens 


dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


444


100 


866


M"771 


luuxus 


alpha-1-macroglobulin


227 


45 


867 


AF979/;/M 

ill ^ / ArUUJ 


Homo sapiens


gephyrin 


3785 


100 


868 


X75285 


Mus musculus


fibulin-2 


3258 


87 


869 


X82494 


Homo sapiens 


fibulin-2


3407 


99 


870 


AJ297743


Mus musculus


torsinB protein 


169 


43 


871 


AJ975ttn 


Homo sapiens


phospholipase C-beta-1a


6258 


99 


872 


AF073344 


Homo sapiens 


ubiquitin-specific protease 3


256 


43 


873


VQ1QSS 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10). 


535 


100 


874




Homo sapiens 


Cdc42-interacting protein 4 


1136 


53 


875




Homo sapiens 


ubiquitin-conjugating BIR-domain
enzyme APOLLON 


627 


100 


876


VAR^Rrt 

J TtkJOU 


Homo sapiens


Human breast tumour-associated 
protein 47. 


2537 


98 


877 


AF182198 


Homo sapiens 


intersectin 2 long isoform 


8764 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 




879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens


Human follicle stimulating hormone 
GPR N-terminal sequence.


210 


23


881


A r A0 1 A£0 

ALU/ 1 Uoo 


Homo sapiens 


JTOA£TT^1 e *» 

al206T>15.3 


2615 


99 


882 


AC005498 


Homo sapiens 


R31665 2 


318 


82 


883


AF165518 


Homo sapiens 


MAGOH isoform 


182 


94 




D21211 


Homo sapiens 


protein tyrosine phosphatase (PTP-
BAS, type 3) 


368 


43 


885 


U13045 


Homo sapiens 


nuclear respiratory factor-2 subunit 
beta 1 


869 


62 


886


A52836 


Homo sapiens 


tryptophan hydroxylase (AA 1 - 444) 


2320 


98 


887 


X51466 


Homo sapiens 


elongation factor 2 


4460 


100 


888


AB039903 


Homo sapiens 


interferon-responsive finger protein 1 
long form 


1096 


98 


889 


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


3130 


100 


890 


AJ243396 


Homo sapiens 


voltage-gated sodium channel beta-3 
subunit 


1024 


100 


891 


W67928 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 4. 


391 


100 


892 


AB020598 


Homo sapiens 


peptide transporter 3 


3017 


100 


893 


Y66648 


Homo sapiens 


Membrane-bound protein PRO1120.


4722 


99 


894 


Y66648 


Homo sapiens 


Membrane-bound protein PRO1120.


3606 


96 


895 


A29218_cd 
1 


Homo sapiens 


19-NOV-1998 DNA encoding G- 
protein coupled 7 TM receptor with 
AXOR15 activity. 


2178 


100 


896 


AJ000332 


Homo sapiens 


Glucosidase 0 


5063 


99 


897 


X98259 


Homo sapiens 


M-phase phosphoprotein 8


1085 


100 


898 


X57110 


Homo sapiens 


c-cbl protein 


4849 


99 


899 


X63652 


Homo sapiens 


inter-alpha-trypsin inhibitor heavy
chain ITIH1 


3376 


98 


900 


X85134 


Homo sapiens 


RB protein binding protein 


2816 


99 


901 


L11672


Homo sapiens 


zinc finger protein 


2047 


58 


902 


Y85565 


Homo sapiens 


Human homologue of UNC-53 (Hs- 
UNC-53/2) sequence. 


369 


83 


903 


X54871 


Homo sapiens 


ras related protein Rab5b 


1094 


100 


904


Z98265 


Homo sapiens 


plakophilin 3 


4065 


100 


905 


AL035295 


Homo sapiens 


hypothetical protein 


959 


99 




AF05 1-782' 


Homo sapiens


sli^i^st-as 1 


5J1 


-35 * 


907


nr208536 


Homo sapiens 


nucleotide binding protein; NBP


1372


100 


908 


U79240 


Homo sapiens 


serine/threonine protein kinase


2365 


98 


909 


U79240 


Homo sapiens 


serine/threonine protein kinase


2386 


99 


910 


AJ132545


Homo sapiens 


protein kinase 


2921 


100 


911 


AJ132545


Homo sapiens 


protein kinase 


1637 


99 


912 


AL121733 


Homo sapiens 


hypothetical protein 


1344 


99 


913 


Y67579 


Homo sapiens 


Human death inducer-obliterator 1 
(DIO-1) polypeptide. 


1586 


100 


914 


X87342 


Homo sapiens 


Human giant larvae homologue 


5317 


99 


915 


X87342 


Homo sapiens 


Human giant larvae homologue 


3495 


96 


916 


M94362 


Homo sapiens 


laminB2 


2357 


93 


917


AJ011654 


Homo sapiens 


triple LIM domain protein 


3432 


100 


918


AJ131899


Rattus 
norvegicus 


proline rich synapse associated 
protein 1 


5776 


88 


919 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1816 


100 


920 


U95822 


Homo sapiens 


putative transmembrane GTPase 


1237 


100 


921 


Y11588 


Homo sapiens 


apoptosis specific protein 


1492 


100 


922 


X84195 


Homo sapiens 


acylphosphatase


510 


100 




U I Loot. 


Homo sapiens


interferon-induced leucine zipper
protein 


1409 


99 


924 


AE000660 


Homo sapiens 


hADV36Sl 


573 


100 


925 


AF126245 


Homo sapiens 


acyl-Coenzyme A dehydrogenase-8 
precursor 


2162 


100


926

AbU019oo 


Deinococcus 
radiodurans


hypothetical protein 


147 


27 


927 


W81S76 


Homo sapiens 


EBV-induced G-protein coupled
receptor (EBI-2) polypeptide. 


1778 


100 




U01317 


Homo sapiens 


beta-globin 


687 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 42 SEQ ID 
NO: 165. 


1401 


100 


931 


Y91644 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317.


1243 


100 


932 


D90279


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790 


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340 
Q39222; match: Q40368 P36412 
P40393 Q40723; match: CE01798 
Q38923 Q40191 Q41022; match: 
Q39433 Q40177 Q40218 Q08146; 
match: P10949 P11023 Q16948
Q20337; match: Q25389 P25228 
P20336 P05713; match: P35276 
Q08147 P17609 P22128; match: 
Q15771 P36410 P35291; GTP- 
binding 


726 


94 


936 


AB041533 


Homo sapiens 


sperm antigen 


1054 


38 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


3914 


100 


938 


AB032481 


Homo sapiens 


homeobox transcription factor 


1744 


100 


939 


AFU1106 


Homo sapiens 


protein serine/threonine phosphatase 
4 regulatory subunit 1 


4682 


99 


940 


Y17999 


Homo sapiens 


Dyrk1B protein kinase


3331 


99 


941


AF305^T{


Homo sapiens


rayroglotalin


455


942




Homo sapiens


cingulin


5939


943 


AK024442


Homo sapiens 


FLJ00032 protein 


1616 


61 


944 


Y359H 

1 


Homo sapiens 


Extended human secreted protein 
sequence, SEQ ID NO. 160. 


262 


35 


945 


AB015320


Homo sapiens 


sigma1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2 


229 


35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase 


6207 


99 


948


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens 


dJ453C12.6.1 (uncharacterized 
hypothalamus protein (isoform 1)) 


257 


42 


951 


AB032435 


Homo sapiens 


differentiation-associated Na- 
dependent inorganic phosphate 
cotransporter 


3063 


99 




AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 


953 


X83587 


Mus musculus


1A13 protein 


1420 


59 


954 




Homo sapiens


dJ545L17.1 (novel protein)


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like
protein (HFASLP).


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55


957 


U68535 


Mus musculus 


aldo-keto reductase 


451 


73 


958




1 Jg| It! 




1594 


57 


959 


U72194 


Mus musculus 


muskelin 


3947 


99 


960




Drosophila
melanogaster


lAiiDioo gene product


277 


54 






Mus musculus


laOZU 


983 


82 


962 


Y67315 


Homo sapiens 



Human secreted protein BL89_13 
amino acid sequence. 


3916 


99 


963


I Of 31 J 


Homo sapiens 


Human secreted protein BL89_13 
amino acid sequence.


3916 


99 


964




Rattus 
norvegicus 


homeodomain 159..341 


1821 


96 


965


Z*y / ojZ 


Homo sapiens


ajjzyAj.j (1viaauo4oU protein) 


3581 


99 


966 


W88995 


Homo sapiens 


Polypeptide fragment encoded by 
gene 146. 


176 


39 


967 


U12465 


Homo sapiens 


ribosomal protein L35


604 


100 


968


API <1 CA1 


Homo sapiens 


CGI-45 protein


1101 


78 


969 


W74865 


Homo sapiens 


Human secreted protein encoded by 
gene 137 clone HMWIF35. 


1348 


98 


970 


L21936 


Homo sapiens 


succinate dehydrogenase flavoprotein 
subunit 


703 


100 


971 


AJ133521 


Drosophila
buzzatii 


protease, reverse transcriptase, 
ribonuclease H, integrase 


194 


23 


972


ACQOoOlV 


Homo sapiens 


N-acetylgalactosaminyltransferase;
similar to Q10473 (PID:g1709559)


3271 


100 


973 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


974


M17885


Homo sapiens 


acidic ribosomal phosphoprotein (P0)


792 


100 


975 


U22829 


Mus musculus 


P2Y purinoceptor 


399 


40 


976


AL132772


Homo sapiens 


dJ1013A22.1 (hepatic nuclear factor 
4, alpha) 


2466 


99 


977





Homo sapiens 


£NrylL 


1550 


43 


978 


J04031 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9;
EC 6.3.4.3).


2824 


63 


979


ArloO/ 1.) 


Homo sapiens


taxol resistant associated protein


217 


/6 


980 


AF136715 


Homo sapiens 


taxol resistant associated protein


306 


95 


981




Caenorhabditis 
elegans 


TXT OA 1 " " ' 

ZK520.1 


1109 


44 


982 


AJ295149 


Homo sapiens




l doh 


an 
yy 


983 


AL021331 


Homo sapiens 


dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE) 


1492 


100 


984 


AL161501 


Arabidopsis 
thaliana 


putative adenosine deaminase 


370 


38 



TABLE 3 



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 4.259e-14 97-120 


3 


BL00298 


Heat shock hsp90 proteins family 
proteins. 


BL00298A 10.97 1.000e-40 74-
119 BL00298E 27.30 1.000e-40
321-376 BL00298F 11.21 1.000e-
40 409-464 BL00298H 20.50
1.000e-40 553-607 BL00298C
16.40 2.286e-40 186-230








BL00298B 15.64 1.290e-39 134-
181 BL00298G 24.57 5.345e-39
465-520 BL00298I 30.07 7.818e-
34 661-715 BL00298D 17.97
6.226e-33 242-282


4 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237A 11.48 4.316e-13 57-82


5 


PD02454 


!!!! PROTEIN ALU SUBFAMILY
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75- 
103 


6 


DM00864 


EGF-LIKE DOMAIN. 


DM00864A 15.21 7.429e-09 98-
119


7 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138- 
160 PR00237B 13.50 8.250e-09 
61-83 


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289 


10 


BL00139 


Eukaryotic thiol (cysteine) proteases 
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-
408 BL00139A 10.29 7.511e-09
67-77


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689-
725 BL01113C 13.18 4.857e-11
757-777 BL01113D 7.47 2.161e-
10 790-800


13 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.813e-14 599-
635 BL01113C 13.18 4.857e-11
667-687 BL01113D 7.47 2.161e-
10 700-710


14 


BL00594 


Aromatic amino acids permeases 
proteins. 


BL00594A 16.75 6.531e-10 50-94


15 


BL01047 


Heavy-metal-associated domain proteins. 


BL01047B 19.73 4.913e-13 707- 
728 


16


PR00625


DNAJ PROTEIN FAMILY
SIGNATURE


PR00625A 12.84 7.462e-18 310-
330 PR00625B 13.48 3. 93 <>e- 15

S40 S5L . _ p ; \


18


BL00615


C-type lectin domain proteins.


BL00615A ^68 3.>l0e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9^S2e-21 175-
195 PR00741F 14.66 9.262e-21
243-265 PR00741B 14.23 1.947e-
18 128-145 PR00741G 9.29
2.180e-17 318-340 PR00741C
9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-
374 PR00741A 9.24 3.596e-13
89-105 PR00741E 13.39 3.535e-
12 215-232


22 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.647e-20 117-
148 BL00107B 13.31 1.000e-16
182-198


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


24 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126-
157


27




Receptor tyrosine kinase class II proteins.


BL00239D -0.13 2.324e-16 91-
139


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain 


BL00018 7.41 3.250e-10 681-694






proteins. 


BL00018 7.41 6.400e-10 717-730


30 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.308e-09 54-81


33 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 401-
416


34 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 411-
426


36 


PR00426 


C5A-ANAPHYLATOXIN RECEPTOR 
SIGNATURE 


PR00426D 10.59 3.618e-12 110- 
122 


37 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 2.049e-10 1080- 
1135 


38 


BL00350 


MADS-box domain proteins. 


BL00350 20.79 1.000e-40 1-55 


40 


BL00123 


Alkaline phosphatase proteins. 


BL00123B 19.31 1.000e-40 90-
133 BL00123C 24.61 1.000e-40
145-195 BL00123E 22.25 1.000e-
40 304-358 BL00123G 26.01
1.000e-40 438-488 BL00123F
19.03 8.714e-35 364-399
BL00123A 10.80 9.000e-24 52-77
BL00123D 12.73 1.000e-17 216-
229


44 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.800e-14 346-359
PD00066 13.92 4.600e-14 486-499
PD00066 13.92 1.000e-13 374-387
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443
PD00066 13.92 8.714e-12 514-527
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331


45 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 2.946e-10 180- 
217 


47 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 1.682e-10 475-
501 BL00649B 20.68 7.387e-09
417-463


50 


PD00066


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290
PD00066 13.92 8.800e-14 333-346
PD00066 13.92 9.400e-14 361-374
PD00066 13.92 4.000e-13 389-402
PD00066 13.92 6.571e-12 473-486


51 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 417-
464 BL00226B 23.86 3.348e-35
251-299 BL00226C 13.23 1.429e-
24 316-347 BL00226A 12.77
1.857e-15 151-166


52 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 5.648e-09 133-
149


53 


BL00232 


Cadherins extracellular repeat proteins
domain proteins. 


BL00232B 32.79 1.000e-40 143-
191 BL00232A 27.72 2.350e-28
49-82 BL00232B 32.79 7.052e-21
252-300 BL00232C 10.65 6.625e-
20 250-268 BL00232B 32.79
1.314e-11 367-415 BL00232C
10.65 9.308e-10 470-488


54 


BL00303 


S-100/ICaBP type calcium binding 


BL00303B 26.15 8.759e-23 125-






protein. 


162 BL00303A 21.77 1.000e-21 
82-119 


58 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 1.000e-15 242-
261 PR00378B 13.80 9.250e-13
109-129


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120- 
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338 


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.091e-13 188-
212 PR00237G 19.63 7.207e-13
268-295 PR00237A 11.48 4.375e-
11 24-49 PR00237C 15.69
3.057e-10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-10 230-
255 PR00237B 13.50 9.438e-10
57-79


70 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.938e-28 31-70 


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE 


PR00830A 8.41 8.759e-12 348- 
368 


72 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 2.149e-10 148-
163


77 


PR00753 


1-AMINOCYCLOPROPANE-1-
CARBOXYLATE SYNTHASE
SIGNATURE 


PR00753E 8.01 3.552e-11 191-
216 PR00753D 6.85 2.778e-09
131-153


78 


PR00506 


D21 CLASS N6 ADENINE-SPECIFIC 
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00506C 19.40 8.017e-09 96-
119


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436- 
467 


84 


BL00675 


Sigma-54 interaction domain proteins 
ATP-binding region A proteins.


BL00675A ?4.*6 8.800e-10 256-
300


85 


BL00027


'Homeobox' domain proteins. 


BL00027 26.43 2.286e-30 117-160


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264-
300 BL00250B 27.37 1.450e-26
328-364


91 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.250e-17 10-35
BL00215A 15.82 6.000e-16 221-
246 BL00215A 15.82 7.857e-12
108-133 BL00215B 10.44 9.526e-
11 168-181


92 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 9.526e-24 324-367 


95 


PR00094 


ADENYLATE KINASE SIGNATURE 


PR00094C 12.94 1.000e-08 119- 
136 


96 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO.


PD02327B 19.84 2.091e-09 143- 
165 


97 


BL00752 


XPA protein. 


BL00752B 19.17 7.309e-09 28-72


98 


PR00876


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e-10 135-
149


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122-
141


100 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 7.429e-31 118-161


101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387 
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246








BL00028 16.07 6.100e-10 258-275 


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 665- 
679 PR00048A 10.52 8.500e-14 
581-595 PR00048A 10.52 9.250e- 
14 637-651 PR00048A 10.52 
2.059e-12 609-623 PR00048A 
10.52 2.588e-12 469-483 
PR00048A 10.52 7.353e-12 553-
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513-
523 PR00048A 10.52 5.696e-10
497-511 PR00048B 6.02 8.875e-
10 429-439 PR00048B 6.02
1.000e-09 457-467 PR00048B
6.02 6.684e-09 485-495


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74
PR00195C 11.50 3.455e-21 126-
144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e-
20 217-237 PR00195E 9.82
8.650e-20 194-211


104 


BL01113 


C1q domain proteins.


BL01113A 17.99 1.865e-09 121-
148 BL01113A 17.99 5.846e-09
82-109


105 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73-
102 BL00420A 20.42 5.708e-09
85-114


108 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE


PR00860B 7.04 2.929e-20 27-41 
PR00860A 5.46 5.500e-16 5-18
PR00860C 9.61 1.4'M^l^ 41-51 


112 


BL01031 


Heat shock hsp20 proteins family profile.


BL01031C 17.68 6.400e-10 122-
147


114 


DM01840 


kw SPAC24B11.09 R07E5.13.


DM01840B 22.04 2.688e-40 59-
103 DM01840A 10.95 9.571e-13
31-43


115


BL01126


Elongation factor Ts proteins. 


BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-
135 BL01126C 9.20 9.735e-11
190-203


116 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 4.375e-21 35-85 


118 


BL00437 


Catalase proximal heme-ligand proteins. 


BL00437A 18.82 1.000e-40 49-
101 BL00437B 16.28 1.000e-40
114-168 BL00437C 21.86 1.000e-
40 190-239 BL00437D 25.72
1.000e-40 248-301 BL00437E
23.95 1.000e-40 327-379




119


BL00140


Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ. 


BL00140D 22.64 8.274e-14 164-
208 BL00140C 11.80 5.444e-10
77-109


120 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 6.712e-10 95- 
148 


122 


BL00203 


Vertebrate metallothioneins proteins.


BL00203 13.94 1.000e-40 16-62 


123 


PR00041 


CAMP RESPONSE ELEMENT 


PR00041D 7.95 2.906e-09 24-41






BINDING (CREB) PROTEIN
SIGNATURE 






124


PR00041


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041D 7.95 2.906e-09 24-41 


125 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061C 7.86 3.250e-10 212- 
222 


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 251-290 


127 


PR00318 


ALPHA G-PROTEIN (TRANSDUCIN)
SIGNATURE 


PR00318D 16.28 1.900e-34 219-
248 PR00318B 14.79 3.455e-27
168-191 PR00318C 12.09 7.000e-
23 197-215 PR00318A 7.84
1.600e-19 35-51 PR00318E 7.23
2.500e-12 265-275


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91 


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824B 9.21 7.750e-22 133- 
153 


131 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 166- 
204 BL00824D 14.04 1.621e-38 
204-239 BL00824B 9.21 7.750e- 
22 133-153 BL00824E 12.49 
1.000e-19 247-263 


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1209- 
1228 


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-
1187


134 


PR00708 


ALPHA-1-ACID GLYCOPROTEIN
SIGNATURE 


PR00708D 14.67 1.000e-27 141- 
168 PR00708C 11.77 1.643e-25 
98-120 PR00708B 15.15 2.174e-
24 73-95 PR00708E 13.33
1.600e-21 189-207 PR00708A
14.40 ?.636e-21 51-70


135




TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


prool :*:: 12.2? 8^68e-r, I, 

145 


136 


PF00023


Ank repeat proteins. 


PF00023A 16.03 3^50e-10^;;- 
217 


137 


BL00471 


Small cytokines (intercrine/chemokine)
C-x-C subfamily signat 


BL00471 23.92 7.480e-10 42-90 


140 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.582e-10 328-
346 PR00205B 11.39 9.018e-10
543-561


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976- 
1027 


143 


PR00979 


TAFAZZIN SIGNATURE 


PR00979E 10.83 5.950e-26 192- 
214 PR00979A 11.91 8.773e-25 
63-83 PR00979C 12.16 6.400e-19 
108-124 PR00979D 12.38 7.955e- 
19 170-185 PR00979F 10.14 
3.382e-15 230-244 PR00979B 
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 17.7K. 


DM00686C 14.14 7.720e-09 111- 
131 






146


PR00604


CLASS IA AND IB CYTOCHROME C
SIGNATURE


PR00604D 15.86 1.000e-17 87- 
104 PR00604B 12.73 9.591e-16 
57-73 PR00604C 10.21 8.200e-12 
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e-
11 44-52 PR00604F 8.60 1.000e-
10 123-132


147 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 3.864e-15 266-
297 BL00107B 13.31 6.143e-11
335-351


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81 


149 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069D 19.36 1.857e-30 187-
217 PR00069A 16.01 7.429e-25
41-66 PR00069E 18.14 3.100e-22 
235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 
8.071e-19 101-120 


150 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 2.688e-27 139-182 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR 


PD02906C 24.17 7.070e-22 165-
200 PD02906B 15.35 8.393e-15
114-127 PD02906A 10.84 6.500e-
09 71-84


153 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479A 19.86 5.091e-12 891-
914 BL00479B 12.57 1.837e-11
915-931


158 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.786e-31 143-186 


160 


BL00422 


Granins proteins. 


BL00422C 16.18 7.750e-12 420-
448


162 


PR00625


DNAJ PROTEIN FAMILY
SIGNATURE 


PR00625A 12.84 9.297e-11 62-82


164 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 6.182e-10 347- 
386 


166 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 83-97 
PR00860A 5.46 1.000e-18 61-74 
PR00860C 9.61 1.900e-15 97-107


167 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196- 
218 


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins.


BL00514C 17.41 1.346e-39 316-
353 BL00514G 15.98 2.241e-34
471-501 BL00514H 14.95 6.571e-
27 510-535 BL00514E 14.28
1.273e-16 388-405 BL00514D
15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260-
276 BL00514F 11.65 9.690e-14
416-431 BL00514A 11.68 8.200e-
11 149-159


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins.


BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34
423-453 BL00514H 14.95 6.571e-
27 462-487 BL00514E 14.28
1.273e-16 340-357 BL00514D
15.35 9.100e-15 321-334
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
368-383 BL00514A 11.68 8.200e-
11 101-111


171 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514G 15.98 2.241e-34 385- 
415 BL00514H 14.95 6.571e-27 
424-449 BL00514C 17.41 4.632e- 
24 230-267 BL00514E 14.28
1.273e-16 302-319 BL00514D
15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
330-345 BL00514A 11.68 8.200e-
11 101-111


173 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.400e-29 119-162


1 IH 




U KWZK632.12 YDR313C 
ENDOSOMALEL 


UMU1V7UU 8.60 5.119e-15 1391-
1404


1 10 


BL00773


Chitinases family 19 proteins. 


BL00773C 9.42 8.000e-09 2-16


182 


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 9.163e-14 141- 
160 


183 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-. 


PD01937A 6.68 3.475e-09 221- 
232 


185 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 


186 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-
541


187 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-
513


188 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09 
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 5.091e-15 69-82 


190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145-
174 PR00194E 8.74 3.250e-30
231-257 PR00194D 9.57 1.500e-
26 175-199 PR00194B 10.24
5.200e-24 120-141 PR00194A
7.86 4.857e-21 84-102


192 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.154e-09 131- 
146 PD02042A 21.13 5.909e-09 
94-121 


193 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195 


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins.


BL00463 8.22 5.071e-09 111-123


196


PR00118


BETA-LACTAMASE A 
SIGNATURE 


PR00118F 16.41 ?.386e-09 165-
181


197 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234-
267


198 


BL00660 


Band 4.1 family domain proteins. 


BL00660A 31.50 5.500e-11 714-
767


199 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.820e-13 70-93 


202 


PR00009 


TYPE I EGF SIGNATURE 


PR00009A 14.15 5.345e-15 971- 
987 PR00009C 14.11 8.773e-13 
996-1008 PR00009D 16.83
8.000e-11 1008-1018 PR00009C
14.11 1.882e-09 892-904 


203 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 4.536e-19 38-59 


205 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 7.300e-10 165-178


206 


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-11 67-86


207 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 3.423e-20 39-60
BL00025 17.17 o./jue-io 88-109


209 


BL00646 


Ribosomal protein S13 proteins. 


BL00646B 21.42 6.100e-30 110-
143 BL00646A 25.82 6.192e-29
14-62


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279-
305 PR00138C 16.41 3.000e-24
218-247 PR00138E 6.01 8.714e-
13 314-328 PR00138A 15.14
9.538e-13 134-148 PR00138B 
15.82 4.522e-12 188-204 


211 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.429e-12 386- 
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408 


212 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-40 163- 
217 PD01941B 15.02 9.705e-30 
420-467 PD01941E 15.92 8.714e- 
23 837-884 PD01941C 19.96 
8.200e-20 508-563 PD01941D 
27.18 1.600e-16 661-710 
PD01941F 28.52 9.645e-15 1005- 
1060 


213 


BL00362 


Ribosomal protein S15 proteins. 


BL00362 24.67 8.313e-09 330-373


214 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115Z 3.12 2.125e-09 1178-
1227 BL00115Z 3.12 6.096e-09
1164-1213


215 


BL00038 


Myc-type, 'helix-loop-helix' dimerization
domain proteins. 


BL00038B 16.97 7.600e-18 125-
146 BL00038A 13.61 1.474e-13
102-118


216 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96-
107


217 


PR00381 


KINESIN LIGHT CHAIN SIGNATURE 


PR00381A 9.55 1.321e-10 360-
378


222 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 2.358e-26 1166- 
1203 BL00514G 15.98 9.000e-15 
1289-1319 BL00514D 15.35 
6.936e-12 1207-1220 BL00514F 
11.65 4.288e-10 1253-1268

1343 


223 


BL00325 


Actin-depolymerizing proteins.


BL00325B 21.66 1.000e-40 93- 
139 BL00325A 24.83 9.333e-24 
61-93 


224 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 1.450e-10 231-244


225 


PF01329 


Pterin 4 alpha carbinolamine dehydratase.


PF01329B 18.52 1.692e-18 67-92 


228 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 6.250e-18 1033-
1065 BL00211B 13.37 8.875e-18
2045-2077 BL00211A 12.23
1.900e-09 931-943


230 


PR00761 


BINDIN PRECURSOR SIGNATURE 


PR00761A 5.81 9.366e-09 275-
292


231 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 3.500e-10 54-69


232 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 1.978e-10 109- 
160 BL00412D 16.54 4.122e-09 
133-184 


233 


BL01210 


Caveolins proteins. 


BL01210B 13.92 8.129e-09 106- 
156 


236 


BL00939 


Ribosomal protein L1e proteins.


BL00939F 17.27 5.393e-09 861- 
891 


238 


BL01252 


Endogenous opioids neuropeptides 
precursors proteins. 


BL01252D 18.25 3.571e-28 205-
233 BL01252B 19.09 5.034e-27
37-67 BL01252C 18.10 1.621e-21
164-190 BL01252A 14.22 7.107e-
18 14-34


239 


BL00302 


Eukaryotic initiation factor 5A hypusine 
proteins. 


BL00302 14.81 1.000e-40 25-79 


240 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR L 


PD02929A 28.27 4.529e-09 235- 
289 


243 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.527e-25 11-50 


244 


BL01270 


Band 7 protein family proteins. 


BL01270C 16.91 6.745e-17 115-
144 BL01270B 18.74 6.857e-17
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87 
9.160e-13 144-182 


245 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253-
308 PF00791B 28.49 1.909e-11
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49
3.890e-09 112-167


246 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.500e-13 277-290 
PD00066 13.92 9.143e-12 193-206 
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234


247 


BL00406 


Actins proteins. 


BL00406D 12.58 6.400e-20 465-
520 BL00406B 5.47 4.857e-14
249-304 BL00406E 8.44 1.000e-
11 522-572 BL00406C 6.75
5.449e-11 313-368


248 


BL00951 


ER lumen protein retaining receptor
proteins. 


BL00951C 19.35 1.000e-40 112- 
161 BL00951A 15.10 7.750e-39 
21-57 BL00951D 13.: t: ^d0C:-3& 
161-196 BL00951B 14.23 3.100e-
31 57-88


252 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200-
227 BL01113A 17.99 4.818e-14
194-221 BL01113A 17.99 7.818e-
14 182-209 BL01113A 17.99
1.730e-13 185-212 BL01113A
17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-
230 BL01113A 17.99 9.182e-11
179-206 BL01113A 17.99 2.532e-
10 176-203 BL01113A 17.99
9.043e-10 218-245 BL01113A
17.99 9.426e-10 209-236
BL01113A 17.99 4.115e-09 137-
164


257 


BL00845 


CAP-Gly domain proteins.


BL00845 16.43 1.837e-21 466-491 


259 


PR00248 


METABOTROPIC GLUTAMATE 
GPCR SIGNATURE 


PR00248G 12.67 2.688e-09 53-78 




260


BL00678


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 3.400e-10 441-452
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 3.400e-10 415-426
BL00678 9.67 5.800e-10 455-466
BL00678 9.67 8.800e-10 332-343


262 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 468-479 
BL00678 9.67 5.800e-10 508-519
BL00678 9.67 8.800e-10 385-396 


263 


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BL50002B 15.18 2.200e-10 415-
429


264 


BL00049 


Ribosomal protein L14 proteins.


BL00049C 17.38 3.040e-12 94- 
130 


265 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 438-470 


266 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 279-311


267 


BL00567 


Phosphoribulokinase proteins. 


BL00567A 10.66 1.161e-12 36-55


269


BL00049 


Ribosomal protein L14 proteins.


BL00049C 17.38 2.688e-28 92- 
128 BL00049B 18.42 6.806e-24 
54-86 BL00049A 13.86 8.333e-19
19-42 BL00049D 13.47 5.765e-12
129-140


272 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 9.735e-12 14-58


273 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 1.911e-09 819-
832


275 


PR00179 


LIPOCALIN SIGNATURE 


PR00179B 9.56 2.895e-13 124- 
137 PR00179A 13.78 3.250e-ll 
36-49 PR00179C 19.02 6.040e-ll 
154-170 


276 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 8.364e-17 22-44
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172-
195 PR00449B 14.34 5.680e-10
45-62


277 


BL00140 


Ubiquitin carboxyl-terminal hydrolase 
family I cysteine activ. 


BL00140D 22.64 1.000e-40 161- 
205 BL00140C11.80 9.053e-30 
79-104 BL00140A 15.96 9.400e- 
28 5-35 BL00140B 12.29 4.6 t9e- 

17 !7-y} < 4 


278


PD02712


ELEMENT TRANSPOSA^H FOR 
TRANSPOSON TRANSPOSABLE. 


PD02712A23.i v : J «.0i5^09 47-83 < 


279 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1 .474c- }? 1 00-1 1 1 


282 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.767e-11 864-
898


283 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.550e-09 56-83 


286 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 1.878e-11 36-54


287 


PR00310 


ANTIPROLIFERATIVE PROTEIN 
BTG1 FAMILY SIGNATURE 


PR00310B 10.59 4.231e-17 29-59
PR00310D 9.10 6.679e-16 89-119 


289 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-36 37-76 


293 


BL00979 


G-protein coupled receptors family 3 
proteins. 


BL00979L 20.63 3.800e-12 111- 
152 


295 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-16 195-229 


296 


BL01064 


Pyridoxamine 5'-phosphate oxidase 
proteins. 


BL01064A 27.84 8.313e-28 77-
129 BL01064C 15.22 7.136e-25
202-235


297 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 2.929e-13 37-56
BL00030B 7.03 1.900e-11 167-
177 BL00030A 14.39 2.000e-10
128-147


298


BL01183 


ubiE/COQ5 methyltransferase family 
proteins. 


BL01183B 21.31 6.660e-12 143- 
188 




BL01279


Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.862e-11 57-
105


301 


BL00191 


Cytochrome b5 family, heme-binding 
domain proteins. 


BL00191K 17.38 4.951e-27 184-
228 BL00191J 11.37 6.447e-17
128-150


302 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416-
451


307 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.818e-21 59-81 
PR00245C 7.84 5.154e-20 238-
254 PR00245D 10.47 4.000e-15
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306 


309 


BL00203 


Vertebrate metallothioneins proteins.


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119-
159 BL00237C 23.19 3.864e-15
251-278 BL00237D 11.23 3.739e-
12 312-329


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110- 
136 BL00380G 11.26 5.800e-16 
267-280 BL00380B 14.77 7.000e-
14 49-62 BL00380F 9.76 5.886e-
13 203-214 BL00380C 15.67
7.387e-13 82-98 BL00380E 12.44 
7.000e-ll 181-193 BL00380A 
10.48 1.000e-09 10-20 


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 50-
105 BL00227C 25.48 1.000e-40
111-163 BL00227D 18.46 1.000e-
40 220-274 BL00227F 21.16
1.000e-40 372-426 BL00227A
21.55 3.535e-39 1-35 BL00227E
24.15 8.500e-34 324-359


327 


BL00232 


Cadherins extracellular repeat proteins
domain proteins. 


BL00232B 32.79 7362e-21 225- 
273 BL00232B 32.79 2.588e-17 
435-483 BL00232B 32.79 6.301e- 
15 116-164 BL00232B 32.79 
6.769e-13 330-378 BL00232C 
10.65 9.341e-12 223-241 
BL00232C 10.65 5.696e-11 328-
346 BL00232C 10.65 3.942e-10
433-451 


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2-
15


330 


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN SIGNATURE 


PR00391E 12.50 7.785e-15 211-
231 PR00391B 8.39 1.000e-13
83-104 PR00391D 12.21 9.328e-
13 191-207 PR00391A 7.83
5.390e-11 16-36


332 


BL01030 


RNA polymerases M / 15 Kd subunits

proteins. 


TIT ftlftIA AA 1 Hi o~ ni on io« 
dLa)1\)j\j L5J¥\ l.oloe-Z? 07-123 


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711 


SYNTHASE
PHOSPHORIBOSYLFORMYLGLY.


PD02711B 14.26 1.973e-20 944-
968


343 


BL00223 


Annexins repeat proteins domain 
proteins. 


BL00223C 24.79 1.000e-40 245- 
300 BL00223B 28.47 8.714e-38 
168-218 BL00223A 15.59 8.250e- 
27 98-132 BL00223A 15.59
8.750e-27 26-60 BL00223C 24.79 
9.438e-16 13-68 BL00223C 24.79 
2.735e-15 85-140 BL00223A 
15.59 2.253e-11 258-292


346 


PR00345 


STATHMIN FAMILY SIGNATURE 


PR00345B 7.12 2.800e-28 81-110
PR00345E 8.54 7.652e-28 158-
183 PR00345C 4.54 9.100e-28
110-134 PR00345D 10.97 1.964e- 
24 134-158 PR00345A 13.46 
5.645e-16 52-71 


347 


BL00586 


Ribosomal protein L16 proteins.


BL00586B 17.00 3.215e-15 184- 
221 


34o 


PR00388


3',5'-CYCLIC NUCLEOTIDE CLASS II
PHOSPHODIESTERASE SIGNATURE 


PR00388A 10.45 2.778e-09 86- 
105 


351 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257 


354 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.947e-09 256-267 


358 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 3.278e-09 175-
195 DM01206B 10.69 6.696e-09 
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177- 
197 


361 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219- 
263 


362 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219- 
263 


365 


BL00178 


Aminoacyl-transfer RNA synthetases 

class-I proteins.


BL00178B 7.11 l.OOOe-U 589- 
6C3- BLO0I7SA 1423 Cu-lje-'CO 
46-56 


366 


BL00523 


Sulfatases proteins.


BL00523E 19.27 1.000e-23 318-
348 BL00523A 13.36 5.500e-16 
30-47 BL00523B 8.64 1.964e-13 
78-90 BL00523C 12.64 9.625e-13 
129-140 BL00523G 9.46 5.500e- 
10 506-516 


369 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.818e-09 21-52


370 


BL00880


Acyl-CoA-binding protein. 


BL00880 17.52 1.000e-40 75-125


371 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.000e-23 276- 
307 BL00107B 13.31 1.692e-12 
342-358 


372 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354 


373 


BL00279 


Membrane attack complex components / 
perforin proteins. 


BL00279E 37.11 9.349e-10 749-
797 


375 


PD01066


PROTEIN ZINC FINGER ZINC-

FINGER METAL-BINDING NU. 


PD01066 19.43 1.231e-33 10-49


377 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.563e-28 10-49 


379 


BL00598 


Chromo domain proteins. 


BL00598 14.45 5.781e-16 3-25
380


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864- 
878 


383 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413D 11.28 8.941e-09 864-
878 


387 


BL01060 


Flagella transport protein fliP family 
proteins. 


BL01060A 15.65 1.535e-09 131- 
174 


388 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 6.318e-11 1009-
1028 


389 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837B 11.64 1.000e-10 469-
483 


391 


BL00240 


Receptor tyrosine kinase class III
proteins. 


BL00240B 24.70 7.907e-10 118-
142 


392 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 691- 
706 


393 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105 


395 


BL00634 


Ribosomal protein L30 proteins. 


BL00634 34.38 4.090e-13 70-121 


396 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013D 26.81 8.000e-26 358- 
402 BL01013A 25.14 7.231e-21 
45-81 BL01013C 9.97 1.000e-13 
132-142 BL01013B 11.33 1.000e-
11 110-121 


397 


BL00930 


Peripherin / rom-1 proteins. 


BL00930E 17.80 1.000e-40 56-92 
BL00930D 9.12 4.632e-37 12-56 
BL00930F 16.91 2.800e-36 92- 
133 


400 


PR00780 


LEUSERPIN 2 SIGNATURE 


PR00780B 4.89 4.491e-09 262- 
285 


401 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 7.158e-11 4-20




BL00381


Endopeptidase Clp serine proteins.


194 BL0038IA 15.48 ^286e-22 
74-111 BL0038;3 21.42 8.326e- 
14 78-130 


405 


BL01105


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49
BL01105B 12.95 1.000e-40 68-
108 


406 


BL00344 


GATA-type zinc finger domain proteins. 


BL00344 17.99 7.000e-12 814-852 


407 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 4.321e-09 9-22


410 


BL00762 


WHEP-TRS domain proteins.


BL00762A 23.43 1.000e-28 752-
789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e- 
18 825-862 BL00762B 16.14
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262-
280 BL00690A 6.87 1.818e-13
230-240 


415 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 52-
107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16
1.000e-40 382-436 BL00227E
24.15 1.750e-34 326-361
BL00227A 24.55 1.000e-33 1-35


416 


PF00992 


Troponin. 


PF00992A 16.67 1.711e-09 557-
592 


418 


BL00541 


Nuclear transition protein 1 proteins. 


BL00541 8.44 9.875e-09 256-310 


419 


BL00541 


Nuclear transition protein 1 proteins. 


BL00541 8.44 9.875e-09 197-251 


420 


PF00856 


SET domain proteins. 


PF00856A 26.14 9.074e-13 901- 
938 PF00856B 16.42 2.397e-12
951-973 


421 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 8.200e-12 33-44 


423 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.600e-30 130-169 


424 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 1.305e-17 421-
472 


426


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


427 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


428 


BL00478 


LEM domain proteins. 


BL00478B 14.79 3.250e-13 115- 
130 BL00478B 14.79 9.036e-13
50-65 


431 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.875e-12 464-487 


432 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.800e-18 316- 
357 PD00930A 25.62 9.617e-12 
125-151 PD00930B 33.72 2.521e- 
10 214-255 


433 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.649e-34 34-73 


434 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.563e-ll 56-78 


436 


PR00120 


H+-TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-19 705- 
722 


437 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115T 8.45 7.273e-29 1208- 
1242 BL00115Q 18.08 2.776e-21
953-983 BL00115Y 11.86 8.000e-
17 1604-1650 BL00115M 19.19 
3.!30iNio 731-774 BLOOilrji * 
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-
659 BL00115O 16.76 5.805e-10
863-913 BL00115P 11.54 7.538e-
10 913-953 BL00115S 18.24
7.968e-10 1010-1052 BL00115U
10.34 4.475e-09 1242-1265


438 


PF00628 


PHD-finger. 


PF00628 15.84 4.536e-10 219-234 


440 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.351e-34 10-49 


441 


PR00309 


ARRESTIN SIGNATURE


PR00309A 9.68 5.250e-24 32-55 
PR00309D 7.09 4.938e-23 290- 
309 PR00309B 7.81 2.800e-21
69-88 PR00309C 8.22 1.621e-19
165-183 PR00309E 9.82 9.438e-
15 374-389 


442 


BL00600 


Aminotransferases class-III pyridoxal-
phosphate attachment si.


BL00600B 19.60 7.324e-14 103-
129 BL00600G 12.43 2.125e-12
306-325 BL00600F 8.77 8.105e-
12 271-284 BL00600E 16.43
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54 
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152- 
195 BL00349F 11.81 1.000e-40 
213-255 BL00349H 15.70 7.387e-
36 361-399 BL00349B 10.51
2.227e-34 54-82 BL00349D 11.70
9.100e-34 125-152 BL00349G 
19.72 5.781e-30 323-356 


445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15 
124-165 


448 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120 


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-
160 BL01283D 11.70 6.000e-39
253-286 BL01283B 23.17 6.538e- 
38 170-212 BL01283C 13.05 
7.750e-19 222-236 


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453 


PR00162 


RIESKE 2FE-2S SUBUNIT 
SIGNATURE 


PR00162B 12.77 7.429e-17 215-
228 PR00162A 9.35 2.324e-14
193-205 PR00162C 8.10 7.120e-
14 227-240


454 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 7.000e-30 87-126


456 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 9.323e-18 1149-
1192 


457 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 2.737e-24 16-55 


459 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 1 J29e-14 154- 
177 BL00290B 13.17 9.000e-12
214-232 


460 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 


PR00413F 14.91 7.333e-ll 193- 
214 PR00413E 15.78 5.714e-09 
175-192 


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE)
INHIBITOR FAMILY SIGNATURE 


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300-
330 


467 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE 
SIGNATURE 


PR00153D 11.99 3.250e-15 510- 
523 PR00153C 11.01 4.682e-14 
495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11.57 
1.720e-13 452-465 


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557- 
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA.


PD00289 9.97 1.000e-14 1482-
1496 PD00289 9.97 8.650e-11
1122-1136


474 


BL50040 


Elongation factor 1 gamma chain profile.


BL50040D 17.41 1.000e-40 279-
329 BL50040E 18.79 1.000e-40 

OOO T5T C A A A AT? lO Aft C 1 0A— 

333-388 BLSI/U4U1* 18.99 5.320e- 

/IA 1AA_/IOtt TIT ^Aft/lfi/^ iCO 

3.739e-38 141-184 BL50040B
13.65 7.000e-30 59-85 BL50040A 
12.98 1.450e-14 10-22 


475 


BL01144 


Ribosomal protein L31e proteins.


BL01144 25.07 1.000e-40 22-74


476


PR00007


COMPLEMENT CIQ DOMAIN 
SIGNATURE 


PR00007C 15.60 2.421e-21 589- 
611 PR00007B 14.16 3.500e-21 

544-564 PR00007A 19.33 6.897e-
20 517-544 PR00007D 9.64
o.j /ie-iz ozj-Oj4 


477 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 5.846e-10 170- 
189 


479 


DM01970 


0kwZK632.12YDR313C 
ENDOSOMAL ID. 


DM01970B 8.60 9.500e-17 967- 
980 


480 


PR00868 


DNA-POLYMERASE FAMILY A (POL 
I) SIGNATURE 


PR00868C 13.76 5.688e-17 284-
308 PR00868A 16.33 3.186e-13
224-241 PR00868H 12.51 3.388e-
13 431-448 PR00868I 10.87
7.938e-11 462-476 PR00868E
13.19 1.608e-10 340-366 


481


BL00027


'Homeobox' domain proteins.


BL00027 26.43 9.182e-22 53-96


482 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061B 25.79 3.647e-21 188-
226 


483


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BL50002A 14.19 1.750e-12 1032- 
1051 


485 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 9.625e-10 760- 
776 PF00023A 16.03 3.571e-09 
715-731 


486 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 9.262e-20 103-

11.: ppcrx ;;;/74 9.o>e-o; 

201-236 


487


PR00370 


FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370G ^>.45 3.769e-28 471- 
493 PR00370£ }?91 l.OOOe-24 
27-46 PR00370C 13.72 4.000e-21 
140-157 PR00370E 11.96 9.229e-
21 320-339 PR00370D 16.33 
1.750e-20 185-204 PR00370F 






fiT YP-OPPOTFTM MA TOP "FWVT7T fYPP 

PROBABLE U3. 




492 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57 


493 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57


494


BL00211


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 58-70


495 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362 
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822 


497 


BL00107 


Protein kinases ATP-binding region 

proteins.


BL00107A 18.39 5.800e-22 214-

04.*! T4T nniftTR 11 0.1 1 AAAa 11 

281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31 
8.615e-12 652-668 


499 


BL00383 


Tyrosine specific protein phosphatases 


BL00383E 10.35 1.000e-14 1902- 
1913 BL00383D 11.92 3.077e-14
1862-1875 BL00383A 13.34 
5.500e-14 1730-1745 BL00383C 
10.10 2.000e-13 1785-1796 
BL00383F 15.51 9.069e-12 1940- 
1956 BL00383B 7.61 1.692e-11
1755-1764 


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136-
150 PR00019A 11.19 1.667e-09
91-105 PR00019B 11.36 4.600e-
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367-
414 BL00226B 23.86 6.143e-27
195-243 BL00226A 12.77 7.840e-
14 96-111 BL00226C 13.23
2.600e-13 309-340 BL00226C
13.23 6.143e-12 266-297
BL00226B 23.86 1.209e-09 146-
194 


DVD 


PD02407


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT phosphoglycer. 


PD02407F 7.61 6.739e-09 916- 
930 




PF00632


HECT-domain (ubiquitin-transferase).


PF00632C 20.66 9.830e-19 991-
1023 PF00632B 18.45 1.155e-11
940-968 


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10
763-778 PR00320C 13.01 6.760e-
10 567-582 PR00320A 16.74
7.618e-10 846-861 PR00320A 
16.74 3.415e-09 763-778 
PR00320A 16.74 6268e-09 ^67- 




BL00479


Phorbol esters / diacylglycerol binding
domain proteins. 


BL00479C 1* b^50e-12 170: 

153 ; „*- 


512 


BL50058 


G-protein gamma subunit profile.


BL50058 27.23 7.494e-09 10-58






Somatomedin B domain proteins. 


£5LUl)524A y.03 o.^ZDe- 14 oO-lOl 


515 


BL00041 


Bacterial regulatory proteins, araC family 
proteins. 


BL00041 23.99 1.964e-19 492-524


516 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.500e-13 391-404 


517 


BL00415 


Synapsins proteins. 


BL00415E 4.82 9.291e-09 959-


518 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126- 
145 


My 




Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 4.750e-09 47-65


coo 




r*T) CI A CO VTA ATM7MTXTET CDC/TCIY"' 

DNA METHYLTRANSFERASE 

SIGNATURE


rKUUDlDA 14. ID /.l^oe-UV 304- 

381 


525 


BL00312 


Glycophorin A proteins.


BL00312B 9.22 5.781e-10 891- 
920 






FINGER METAL-BINDING NU. 




529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131-
150 PR00254A 11.23 4.706e-14
61-78 PR00254C 11.36 4.000e-12
113-126 PR00254B 12.97 1.486e-
11 95-110


531 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 6.870e-16 787-
810 


532 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE


PR00193D 14.36 3.143e-34 447- 
476 PR00193C 12.60 7.632e-32 
216-244 PR00193B 11.69 7.750e-
29 167-193 PR00193A 15.41
2.588e-22 111-131 PR00193E 
i y .4 / z.zuue-2 i du i o j u 




PD02870


RECEPTOR INTERLEUKIN-1

PRECURSOR. 


ru\)2.ol\JD lo.oJ j.jyo&-[)y 348- 

381 






HOMOLOGY DOMAIN SIGNATURE 


JPRUUdS^D 15.87 z.452e-10 465- 
484 


536 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.684e-24 164-207 


538 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 2.739e-09 225- 
237 


539 


BL00406 


Actins proteins.


BL00406C 6.75 1.000e-40 157- 
212 BL00406B 5.47 6.143e-37
90-145 BL00406D 12 J8 4.600e- 
36 291-346 BL00406E 8.44
2.200e-33 364-414 BL00406A 
9.95 4.441e-23 7-42 


540 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE


PR00456E 3.06 9.625e-10 44-59 


541 


PR00456


RIBOSOMAL PROTEIN P2 

SIGNATURE 


PR00456E 3.06 9.625e-10 44-59 


542 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.857e-ll 138- 
154 


544 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and 
similar). 


PF00642 11.59 9.082e-10 838-849


546 


BL00383 


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 4.115e-10 104- 
115 


547 


BL01226


Hydroxymethylglutaryl-coenzyme A
synthase proteins.


W01226A 13.79 l.OOOe-40 50-8 > 

: ^6C a3.^ - :H7- 
167 BL01226D i... : A 1.0ooJ-40 
174-210 BL01226E S.741.000e- 
40212-253 BL01226H V. 74 
1.000e-40 386-434 BL01226I
25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-
321 BL01226B 13.35 1.818e-31
95-127 BL01226F 9.78 8.714e-23
253-271 


549 


BL00964 


Syndecans proteins. 


BL00964B 12.05 2.426e-10 1246-
1289


DDL 


DM01930


L KW riiNUcK bMvJA oMLY 
YDR096W. 


DM01930E 15.41 1.367e-37 170-
215 DM01930F 14.16 8.232e-28 
20/-JU3 DM01930B 19.86 
9.163e-10 37-71 


552 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e-09 9-29 


554 


BL00383


Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 2.756e-12 436-
447 


555 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 7.612e-11 122-

107-121 PR00403B 12.19 2.068e- 
09 76-91 


558 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 2.714e-26 76-98 
PR00380D 9.93 3.000e-24 275- 
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zy/ rKAJUj oliu J J. lo j. 154e-20 
226-245 PR00380B 12.64 9.400e- 
20 195-213 


559


BL00518 


Zinc finger, C3HC4 type (RING finger),

proteins. 




561 


PD01795 


PROTEIN AMINOPEPTIDASE

PRECURSOR HYDROLASE SIGNA.


PD01795B 11.56 2.333e-12 159-
172 PD01795A 10.27 1.000e-09

135-144 


562 


PD01795 


PROTEIN AMINOPEPTIDASE

PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 110-
123 PD01795A 10.27 1.000e-09 
oo-yj 


563 


BL00018 


EF-hand calcium-binding domain 


BL00018 7.41 1.391e-09 41-54


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 4.115e-09 284-
295 




PF00850


Histone deacetylase family. 


PF00850E 8.88 6.553e-21 756-782
PF00850D 14.76 1.519e-16 722- 

HAH T5T7AAO C AT? 1 4T Tft 111 O ~ i i 

74o PF00850F 15.70 1.118e-11
794-827 PF00850G 22.75 8.375e-
11 o33-o/j 


570 




PRESYNA. 


rU\}\J£6y y.yj 4.yo0e-10 137-151 


571 




Zinc finger, C3HC4 type (RING finger),
proteins. 


DT AAC lO 11 Ol O OAA« 1 1 A A co 

dLUUdIS o.800e-li 44-53 


573


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 986-
1021 


576 


BL00284 


Serpins proteins. 


BL00284C 28.56 5.200e-26 200- 
242 BL00284A 15.64 4.913e-18 
71-95 BL00284B 17.99 7.261e-15 
173-194 BL00284D 16.34 5.846e- 
13 306-333 BL00284E 19.15


579 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 6.553e-29 15 


580 


BL50001


Src homology 2 (SH2) domain proteins
profile. 


BL50001B 17.40 4.500e-12 1010-
1031 


581 




PROTEIN GTPASE DOMAIN
ACTIVATION.


VAAAQQAD 1 1 TO 1 1 OA** TO /AO 

JrJJUU93UB 33.72 3.1o9e-22 608- 
649 PD00930A 25.62 6.806e-17 

DU3- j3 I 


584 


BL00612 


Osteonectin domain proteins.


riJLuUoizo 1 1 .3 0 z.U34e-l 1 93- 


585 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102- 
122 


586 


PF00628


PHD-finger. 


PF00628 15.84 3.455e-12 235-250 


587 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.063e-10 85-128


588




GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 7.525e-16 227-
248 PR00326C 9.79 6.760e-15 
Z/O'^yZ rKUU3zolJ 19.09 o.o57e- 
13 293-312 PR00326B 16.74 

O OOOo 11 Oytfi o«*y 


589 


BL00422 


Granins proteins. 


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295- 
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins.


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-
132


596 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 3.136e-09 31-46 


597 


DM00547 


1 kw CHROMO BROMODOMAIN
SHADOW GLOBAL. 


DM00547C 17.30 1.667e-19 207-
229 DM00547E 13.94 6.200e-18
319-342 DM00547B 11.28

1 Artrttt IT 1*70 1 fV> TVfcVTAACjI'Tr* 

11.60 9.250e-13 289-303 

lyJVLUUD** IT LdJ\5 0. /Z/&-IZ O/Jr- 

726 DM00547A 12.38 4.818e-ll 


600 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 1.882e-27 13-52 


601 


BL00192 


Cytochrome b/b6 heme-ligand proteins. 


BL00192A 11.90 6.400e-09 390-


602 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157 




BL00936


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157 


606 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09 
323-337 


607 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09
323-337 


608 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 9.500e-12 168-
183 PR00320A 16.74 2.853e-10 
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10 
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 


610 


BL00750 


Chaperonins TCP-1 proteins.


BL00750B 16.17 1.000e-40 70-

120 bu'..-lV.2o.i:?6.?i: 

26-69 EL00750G 20.12 8.L ; >e-3i , 
431-471 BL00750F 18.40 5.i:> 
30 370-411 BL00750E 24.59
8.650e-29 295-332 BL00750H 
21.44 1.000e-27 489-524 
BL00750C 25.65 5.345e-17 149- 
181 BL00750D 16.16 6.318e-14
203-222 


613 


BL00766 


Tetrahydrofolate
dehydrogenase/cyclohydrolase proteins. 


BL00766B 24.49 1.000e-40 142- 
190 BL00766E 13.78 1.000e-40
jZZ-Jjy oLUU/OOC zj.oO S.MJUe- 

39 208-256 BL00766D 17.05 
21.48 6.063e-24 102-132 






/WlipUlLulCUv I1VI IIIUIIC TnTTlliy pi I iTEinfl T 


TIT AAO^iC lO Tfi Q lOftal A HA£~*]*< 

dxJjvZjO \Lu.o j^yoe-iu /40-/jj 


616 


BL00319 


Amyloidogenic glycoprotein extracellular 
domain proteins. 


BL00319C 17.12 9.053e-09 419- 


617 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins.


BL00030A 14.39 4.429e-09 44-63


618 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 4.429e-09 44-63


620 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 5.817e-16 77- 
123 


622 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.


BL00972A 11.93 5.500e-19 213-
231 BL00972D 22.55 2.742e-16
501-526 BL00972B 9.45 1.000e-

1 1 2yf-j\)l DlAJXJy //L 10.4o 

3.160e-11 370-385 BL00972E
20.72 7.517e-10 526-548 


625 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 6.333e-39 6-45 


628 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 7.750e-31 478- 
524 BL00039A 18.44 2.000e-25 
198-237 BL00039C 15.03 1.844e-
15 327-351 BL00039B 19.19 

5.0 Joe- 14 z4z-ZOo 


630 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 232- 
246 


631 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e-12 290- 
304 


633 


BL00785 


5'-nucleotidase proteins.


BL00785C 9.45 3.625e-16 108- 
122 BL00785E 15.85 4.000e-16 
279-295 BL00785A 9.73 6.500e- 
14 29-40 BL00785B 10.65 
5 J00e-13 72-86 BL00785D 9.89 
4.000e-12 135-145 


636 


PR00832 


PAXILLIN SIGNATURE 


PR00832E 14.43 9.901e-14 85- 
108 


637 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 6.362e-13 221- 
240 


638 


PF00635 


MSP (Major sperm protein) domain 
proteins. 


PF00635B 15.84 4.900e-ll 463- 
502 


639 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 1.900e-18 85-99 
PR00860C 9.61 1.474e-14 99-109 
PR00860A 5.46 1.720e-14 63-76 


641 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 4.462e-15 271-284 
PD00066 13.92 4.462e-15 299-312 
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 U00e-13 551-564 
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536 
PD00066 13.92 9.500e-13 215-228 
PD00066 13.92 9.500e-13 243-256 
PD00066 13.92 9.500e-13 579-592 
PD00066 13.92 8.615e-10 607-620 
PD00066 13.92 1.600e-09 187-200 


642 


BL00961 


Ribosomal protein S28e proteins. 


BL00961B 11.24 7.429e-37 67- 

1 ftft Ti T f\f\f\£ 1 A A ftA A ft*7A.— *S£. 

100 BL00961A9.90 4.079e-2o 

A*> CC 

42-0O 


643 


BL00585 


Ribosomal protein S5 proteins. 


BL00585A 28.43 1.391e-40 103- 
155 BL00585B 18.78 3.250e-30
193-230 


647 


BL00678


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 9.400e-10 181-192


fiAQ 
Krro 


rivUUo / 0 


SIGNATURE 


126 


652 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.941e-27 29-68


653 


BL00047 


Histone H4 proteins.


BL00047A 13.33 1.000e-40 2-41
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dLUUU4Yd 0.5 1 I.429e-4041-74 
104 


654 


PD01066


PROTEIN ZINC FINGER ZINC-

FINGER METAL-BINDING NU. 


phai n^/* i o ai a i no«_oc oa^o 
ruu luoo i y.*r j *t, i uye-Z5 jiM>y 


655 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 3.483e-17 19-63 


657 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.286e-10 31-40




BL00125


Serine/threonine specific protein 
phosphatases proteins. 


BL00125B 21.48 1.000e-40 89- 
135 BL00125C 19.97 1.000e-40 
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83
8.941e-38 47-84 


659 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.200e-16 492-505 
PD00066 13.92 9.308e-15 380-393 
PD00066 13.92 6.000e-13 352-365 
PD00066 13.92 7.000e-13 240-253 
PD00066 13.92 7.500e-13 268-281 
PD00066 13.92 7.500e-13 408-421 
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.189e-26 29-68 


661 


BL00795 


Involucrin proteins.


BL00795C 17.06 7.882e-15 193- 
238 BL00795C 17.06 3.797e-13
187-232 BL00795C 17.06 5.014e- 
13 188-233 BL00795C 17.06 
4.506e-12 196-241 BL00795C 
17.06 7.896e-12 191-236 
BL00795C 17.06 1.667e-11 185-

230 BL00795C 17.06 2.000e-11
198-243 BL00795C 17.06 3.778e- 
11 171-216 BL00795C 17.06 
6.111e-11 197-242 BL00795C
17.06 6.44", i lo^r^ 
PIL00795C ? 7.06 8.000e-l 1 U -v j 
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e-
10 195-240 BL00795C 17.06 
2.779e-10 184-229 BL00795C 
17.06 4.035e-10 199-244 
BL00795C 17.06 5.081e-10 186- 

231 BL00795C 17.06 6.965e-10 
190-235 BL00795C 17.06 2.700e- 
09 200-245 BL00795C 17.06 
5.800e-09 175-220 BL00795C 
17.06 6.500e-09 182-227 

Dl nATOC/"* 1*7 AiC H £.f\f\r± AO OA1 

BL00795U 17.0oo.o00e-0y 201- 
246 BL00795C 17.06 6.600e-09 

O AO OylT HI AATO**r* 1 *7 A£ A <AA fl 

09 208-253 






Nucleoside diphosphate kinases proteins.


tjt i\f\A£Q 11 OO 1 AAAa_4A 140 OA^ 
U-LUU40V ZZ.ZZ l.UUUe-4U J4V-ZU4 


663 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.411e-11 331-
385 


664 


BL00601 


Tryptophan pentad repeat proteins (IRF 
family) proteins. 


BL00601A 20.29 5.500e-23 7-46
BL00601B 20.92 3.631e-13 69-98 


665 


BL00082 


Extradiol ring-cleavage dioxygenases 
proteins. 


BL00082A 19.07 8.615e-12 49-72


666 


DM01537 


kw SKI2W SKI2 NUCLEOLAR
HELICASE.


DM01537B 21.63 4.073e-37 834-
881 DM01537B 21.63 9.750e-21
1669-1716 DM01537A 15.14
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557


667


DM01537


HELICASE. 


T"\A4A1 C1TD 0 1 dl H QO'Ja 1 Q MA 

867 DM01537B 21.63 9.750e-21 
1655-1702 DM01537A 15.14 

o rcn« IQiCQ/1 TA/I TYfcjTAI <OT A 


ooy 


BL00107


Protein kinases ATP-binding region
proteins.


CLrvUlU/A I0.J7 O. /oOe-Z** o**?- 

880 BL00107B 13.31 6.727e-13
916-932 


find 

O/v 




LrDiqUlUIl UUIJlalD proteins. 




671 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.571e-12 432-475 


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key'
motif proteins. 


BL00225B 18.06 7.517e-24 1805-
1840 BL00225B 18.06 8.297e-20
1987-2022 BL00225B 18.06 
2 J75e-19 1896-1931 BL00225B 
18.06 8.200e-19 175-210 
BL00225B 18.06 8.200e-19 1698-
1733 BL00225B 18.06 4.808e-14
73-108 BL00225B 18.06 4.808e-
14 1596-1631 BL00225B 18.06 
5.500e-14 2077-2112 BL00225A 
13.82 5.829e-12 2043-2064
BL00225A 13.82 3.127e-09 1759- 
1780 


679




G-PROTEIN BETA WD-40 REPEAT
SIGNATURE 


PR00320C 13.01 4.240e-10 169- 
184 PR00320A 16.74 6.294e-10 
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain
proteins.


BL00243I 31.77 1.143e-11 172-

VA 


681


PR00852 


XERODERMA PIGMENTOSUM
GROUP D PROTEIN SIGNATURE 


:< Xuuo^ ! 5.90 1.000e-29 612- - 
6J* PR00852E 8.14 3.769e-27
348-:-? [ PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I
17.20 3.500e-25 683-704

PR00852F 11.85 5.909e-24 379-
398 PR00852G 16.19 4.462e-23 
468-486 PR00852C 8.81 9.143e- 
23 284-303 


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63




BL00972


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


TIT AAIV70 A 1 1 1 CAA» OA At\ CO 

BL00972A 11.93 7.500e-20 40-58
BL00972D 22.55 3.903e-16 300-

120-130 BL00972E 20.72 5.500e- 
11 325-347 


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688


BL00388


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54
TVfiWTCJiRB 11 3 R£4p.11 ftVi- 

108 BL00388D 20.71 1.000e-21 
153-184 BL00388C 18.79 8.147e- 
16 126-148 


689 


PD02796 


PROTEIN STEROL CARRIER LIPID- 


PD02796B 20.92 1.105e-15 347- 






TRAN. 


394 


691 


PD01572 


PHOTOSYSTEM II REACTION
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-31 


692 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.600e-10 488-505




BL01013


Oxysterol-binding protein family
proteins. 


563 BL01013D 26.81 8.335e-23
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT
PRESYNA. 


PD00289 9.97 3.571e-13 164-178 
PD00289 9.97 8.650e-11 2147-

37 


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282-
302 


700 


PR00749 


LYSOZYME G SIGNATURE 


PR00749F 13.63 8.636e-13 139- 
156 PR00749H 8.22 3.681e-12
173-194 PR00749B 16.54 1.419e-
11 48-70 PR00749C 7.26 3.060e-
11 72-91 PR00749A 10.33
4.815e-10 24-45 


703 


PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505
PR00704D 11.05 2.500e-27 132-
158 PR00704E 12.55 5.500e-27
162-186 PR00704F 13.61 1.000e-
22 187-215 PR00704G 13.87
1.237e-21 317-339 PR00704H
13.38 8.138e-21 367-385
PR00704A 14.68 2.125e-19 27-51
PR00704C 11.88 1.257e-17 9o-
113 PR00704B 17.94 1.833e-15 
72-95 


705 

fvO 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE


PR00859C 7.06 2.776e-09 94-111




Intermediate filaments proteins.


BL00226D 1 9. 10 ? t • » ; e-26 36!?- 
416 BL00226B 23.86 3.250e-24 
ZUd-jLdI JdLUuZZoC 15.23 o.2oye- 
21 268-299 BL00226A 12.77 

ft OfW>i* 1 A 1 fV3 1111 


707 


PR00021 


SMALL PROLINE-RICH PROTEIN 

SIGNATURE


PR00021A 4.31 2.440e-10 2-15


708 


BL00361 


Ribosomal protein S10 proteins. 


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.300e-10 2-15


710 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins.


BL00514C 17.41 8.412e-27 160-

1 0T HT ftft<1 AX3 1 A OQ ft qaa- i £ 

219-236 BL00514H 14.95 1.551e- 

1^317 1AO TXT fin^ 1 Afl 1 ^ Oft 

7.750e-15 284-314 BL00514D 
15.35 4.789e-10 201-214 


711 


PD00930 


PROTEIN GTPASE DOMAIN 


PD00930B 33.72 8.714e-12 49-90 


714 


BL00400 


LBP / BPI / CETP family proteins.


BL00400C 24.53 6.029e-17 158-
202 BL00400D 23.26 2.080e-14 
222-259 BL00400A 21.59 1.600e- 
10 27-59 


715 


BL01154 


RNA polymerases L / 13 to 16 Kd 


BL01154B 24.55 5.500e-36 40-76






subunits proteins. 


BL01154A 18.70 3.000e-22 19-40


71 A 
1 10 


PD01066


PROTEIN ZINC FINGER ZINC-

FINGER METAL-BINDING NU. 


rDUluoo ly.4i *J./ooe-32 10-49 


fit 




Mitochondrial energy transfer proteins. 


xlLUUZl jA IXoZ y^Uoe-14 77- 
102 BL00215A 15.82 8.412e-10 
175-200


719 


BL00309 


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87 


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266- 

A 1 y" T"\T f\ f\ f\* II A ^ #\A /V AAA Ai A 

316 BL00687D 26.00 5.333e-28 
151-198 BL00687B 17.54 3.647e- 
26 39-81 BL00687C 24.13
6.087e-22 96-133 BL00687F 9.55
2.500e-11 352-363


727 


DM01354 


kw TRANSCRIPTASE REVERSE II 
ORF2.


DM01354N 13.17 1.000e-40 129- 
174 DM01354O 8.73 6.605e-15 
180-226 


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BL 


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-
185 BL01024D 13.22 1.000e-40
185-222 BL01024E 11.96 1.000e-
40 222-266 BL01024F 9.42
1.000e-40 266-317 BL01024G
11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-
442


736 


PF00913 


Trypanosoma variant surface 
glycoprotein. 


PF00913D 11.90 7.130e-10 24-51 


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82- 
101 


740 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 

SIGNATURE


T^T% A A A, A^* *1 A A 1 /AA A A f A A A 

PR00320C 13.01 1.600e-09 68-83
PR00320A 'fi ,71 7 36t' M)9 f ? Z\ 


743 


PR00871


DNA
NUCLEOTIDYLEXOTRANSFERASE
(TDT) SIGNATURE


* - A A A^V 4 -« A 4 A A A, A A. A /\ . * 

?iv0087 1 G 1 4.48 8.000e-09 j : ^ i 

201 1 

i 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.200e-15 221- 
246 BL00215A 15.82 7.618e-14 
20-45 BL00215A 15.82 8.851e-11
123-148 BL00215B 10.44 9.526e-
11 69-82 BL00215B 10.44
7.300e-09 272-285 BL00215B
10.44 8.500e-09 165-178 


731 




Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.000e-14 370-
389 BL50002B 15.18 2.200e-10
408-422 






hmu i /z proteins. 


JtsjLrUUJjio 1 1.4/ j.uove-iz jyo- 
440 


753 


PF00622 


Domain in SPla and the RYanodine
Receptor. 


PF00622B 21.00 4.214e-14 47-69






ABC transporters family proteins.


£>LrUUzllA IZ^J o.74ie-lU 0O-/O 


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70








4.971e-15 344-363 PR00926B 
16.07 9.526e-13 210-225 
PR00926A 10.41 1.514e-12 197-
211 


756 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


BL01187A 9.98 2.125e-12 324-

iojC nr r\i t 0*7 a n no a ion- 1 t 
330 BL0 1 1 o/A 9.98 4.7o9e-l 1 

377-389 BL01187B 12.04 3.057e- 

1U 43Jr-4 Oj 


757 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins.


PF00651 15.00 4.429e-10 43-56 


758 


PR00055 


HIV TAT DOMAIN SIGNATURE


PR00055A 8.13 8.855e-09 144- 
156 


759 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.304e-11 110-123


760 


PR00448


NSF ATTACHMENT PROTEIN
SIGNATURE


PR00448D 12.42 3.455e-27 162-
186 PR00448A 10.74 1.273e-22
37-57 PR00448B 16.01 9.379e-21
100-118 PR00448C 11.46 1.000e-
20 129-147


765 


BL01042 


Homoserine dehydrogenase proteins. 


BL01042A 13.29 5.909e-11 74-95


766 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78 


768 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 8.500e-28 112- 
149 BL00762B 16.14 3.793e-12 
64-78 BL00762A 23.43 6.625e-12
6-43 BL00762C 15.58 4.176e-09
459-472 BL00762D 11.15 9.667e-
09 210-220


769 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 1.934e-09 1-20 


770 


PR00320 


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE 


PR00320C 13.01 1.720e-10 262-
277 PR00320A 16.74 2.853e-10
262-277 PR00320C 13.01 4.300e-
09 96-111 PR00320B 12.19
5.500e-09 262-277 PR00320A 
16.74 ~ 's%70 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019.U ! i3x> w.714e-i2 87- 
101 PR00019A 11.19 1.000e-10
90-104 


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6Ji^>10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155-
204 


774 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547F 23.43 3.942e-28 943- 
990 DM00547E 13.94 9.750e-21 
652-675 DM00547B1U8 
1.818e-18 518-532 DM00547C
17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-
509 DM00547D 11.60 9.20ve-ll 
622-636 


776 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE 


PR00779F 14.51 5.147e-09 769-
792 


777


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 


778


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 
765 


779 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 







rKUUtU^xo 1 1 .37 i.l 1 oe-x 1 654- 
672 PR00205B 11.39 8.588e-11
230-248 PR00205B 11.39 8.527e-
10 551-569 PR00205B 11.39
4.203e-09 336-354 


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins. 


BL00625B 17.69 2.167e-19 193- 
227 BL00625A 16.21 5.500e-17 
199-228 BL00625B 17.69 1.885e- 

1 A 1 Afl 1 7 A TIT AA/CO *Tl 1 T £Q 
10 14U-1/4 OIAIUOZjXj 1 /.t)7 

2.770e-16 245-279 BL00625A 

luxl y. 1 lJC-JO ZJ 1-Z5U 

BL00625A 16.21 6.507e-14 146- 

175


785 


PF00084 


Sushi domain proteins (SCR repeat 


PF00084B 9.45 7.188e-10 595-607 

PPAAA5MQ O 4< fi 4AfWAQ 

rruwo*tD y.*? j o.*#uve-uy OjO-ooo 


786 


PF00084 


Sushi domain proteins (SCR repeat 


PF00084B 9.45 7.188e-10 595-607 


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203- 
230 


/ oo 




VON WILLEBRAND FACTOR TYPE

A DOMAIN SIGNATURE 


rK.UU4DJA 1Z./7 l.jlUe-14 30-34 

PR00453B 14.65 8.568e-10 75-90


/oy 




ORNITHINE

CARBAMOYLTRANSFERASE 
SIGNATURE 


T>T> f*A 1 HOD 1 >l OO C ill AA a^i 

977 


790 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030B 7.03 5.500e-ll 199- 
209 


/yi 


Di An/i 1 < 


Synapsins proteins.


BL00415N 4.29 9.519e-10 393-
437 BL00415N 4.29 2.117e-09
103-147 BL00415N 4.29 3.028e-
09 97-141 BL00415N 4.29
5.664e-09 387-431 


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 2.091e-36 105-144 


799 


PF00731


AIR carboxylase


PFG0731C 23.16 7.3333-35 33V 
dov rruU/3ix> J*. /.4Zye-zo 
299-336 PF00731A 19.32 6.333e-
24 268-297


" Ovrr 


BL00170


Cyclophilin-type peptidyl-prolyl cis-trans
isomerase signatur. 


TIT AA1 TAD OA GT 0 ATI a AO TOT 

oLUUl /Ud zu.y / o.ii/ie-uv /- 
337 


OUJ 


BL00678


Trp-Asp repeat proteins proteins.


TIT AA/CTft O i*T 3 /IA/L» 1 A 3T0 1QO 

BL00678 9.67 5.800e-10 418-429 

TAT /Vl^7ft 0 ft HO a- 1 ft OQ^.'XCiA 


806 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE 


PD01719A 12.89 7.571e-14 290-


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 9.100e-09 451-


809 


BL00107 


Protein kinases ATP-binding region 

proteins.


BL00107A 18.39 4.462e-12 564- 

595


810 


PR00453 


VON WILLEBRAND FACTOR TYPE
A DOMAIN SIGNATURE


PR00453A 12.79 lJ10e-14 36-54 

rl^UVrrJJD 1H.UJ O.JUOClU /J'7w 


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


Ol J 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 2.047e-31 16-55


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE


PR00830A 8.41 9.571e-11 115-






PROTEASE (S16) SIGNATURE 


135 


819 


BL00126 


3'5'-cyclic nucleotide phosphodiesterases


BL00126C 22.07 7.857e-24 528- 

^AO HT AA17£T? 1< 1*} 1 *T1 >l- 1 < 

joy mAJUizob jj^ZZ ,5.714e-l 3 
669-724 BL00126D 25.50 1.173e- 
14 584-623 BL00126B 15.20 
i aaa*_17 ^ao <ia ht aai7/*a 

27.56 3.361e-09 461-498 




PR00511




JtIvUUjI id VZ.Uo o.o/oe-Zz I /4- 
195 PR00511A 13.59 7.723e-11

1 ^-177 


821 


BL00741 


Guanine-nucleotide dissociation 

stimulators CDC24 family sign.


BL00741B 14.27 2.800e-15 13-36 


822 


PF00780 


Domain found in NIK1-like kinases,

mmtcp /*itiYin fin/) VAnct" T?/*}Ajf 
luuudc uuuu emu jrcuai ivv/ivi . 


PF00780I 14.69 4.825e-09 231-

ZOl 


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-ll 144- 
163 


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-11 545-
586 


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-
133 PD02448C 13.62 1.000e-40
152-189 PD02448E 11.33 9.000e-
30 235-261 PD02448F 14.22
9.654e-25 279-303 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 305- 
318 






Guanine-nucleotide dissociation 
stimulators CDC25 family sign.


BL00720B 16.57 4.500e-23 483-

^AT 


831 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.625e-21 143-

1 1A DI Art 1 /VTD 15 11 A 11 >(. in 

1/4 dLUUIU/15 1j.j1 4«fiX4e-10 
213-229 




BL00215


Mitochondrial energy transfer proteins. 




833 


PR00497 


NEUTROPHIL CYTOSOL FACTOR 


PR00497A 6.92 4.375e-09 41-59


BL00229 


Tau and MAP proteins tubulin-binding
domain proteins.


BL002/.: >A -jAJ! 9.565e-10 99- 
i i« 


■ 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.?7 2J216e-09 1053- 
1083 


836 


BL00795 


Involucrin proteins.


BL00795B 12.41 7.931e-09 405-
445 


837 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 1.000e-17 34-53
PR00020B 15.52 5.846e-16 68-85
PR00020D 12.70 2.543e-15 147-

1/C7 DBAAAOAf* 11 /CXO AQ1*± 11 

10Z rKJUUUZUU li.oo j.4oie-li 
95-107 PR00020E 8.64 6.586e-13

170 
10D-1 fy 


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499- 
1515 


839 


PF00850 


Histone deacetylase family. 


PF00850C 14.55 9.542e-09 1352- 
1369 






Ank repeat proteins. 


PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139-
149 PF00023B 14.20 5.500e-09
40-50 


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85 
BL01194C 12.35 9^50e-40 103- 
138 BL01194A 18.70 7.632e-38 
2-37 BL01194D 19.02 2.658e-36
139-178 


843 


BL00610 


Sodium:neurotransmitter symporter 
family proteins. 


BL00610A 17.73 1.000e-40 40-90 
BL00610B 23.65 1.000e-40 104- 
154 BL00610C 12.94 1.000e-40 
206-258 BL00610E 20.34 1.000e-
40 355-398 BL00610F 29.02 
1.000e-40 454-509 BL00610D 
20.97 6.063e-35 272-325 
BL00610G 12.89 8.588e-13 514- 
537 


845 


BL00143 


Insulinase family, zinc-binding region 
proteins.


BL00143A 20.91 4.300e-20 94-
121 BL00143C 14.10 5.500e-13 
245-258 BL00143B 14.41 9.053e- 

in i>( i i« 
IU 141-IjO 


846 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


848 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 129- 
167 BL00824D 14.04 6.192e-39 
167-202 BL00824B 9.21 2.080e- 
21 96-116 BL00824E 12.49 
3.333e-19 210-226 BL00824A 
13.78 8.650e-14 19-34 


849 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.000e-40 12-51 


850 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 7.316e-24 10-49 


852 


BL01272 


Glucokinase regulatory protein family
proteins. 


BL01272B 19.61 6.870e-30 136- 
171 BL01272C 11.68 3.314e-25 
249-274 BL01272A 6.49 1.231e-

i o no i it 

lo 99-117 


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATE 


PD00930B 33.72 9.341e-20 65- 

m 


854 


PD00289


PROTEIN SH3 DOMAIN REPEAT
PRESYNA. 


PD00289 9.97 6.850e-11 140-154


858 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 3.250e-25 68-90 
PR00450B 11.76 8.125e-23 22-42 
PR00450D 16.58 8.920e-22 92- 
112 PR00450E 12.14 1.581e-19
114-133 PR00450G 15.33 5.500e-
19 166-187 PR00450F 12.30 

*k.J /De-ID 1*H/-1jO rKULWDUA 

13.58 1.857e-14 8-23 


860 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 7.188e-27 74-117


866 


BL00477 


Alpha-2-macroglobulin family thiolester
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867


BL01078


Molybdenum cofactor biosynthesis 
proteins. 


BL01078B 14.20 1.621e-20 408-
429 BL01078A 10.16 2.000e-13
366-379 BL01078D 5.99 3.455e-
11 566-576 BL01078C 10.52
3.793e-11 501-513




BL01177


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 462-
489 BL01177C 17.39 5.333e-19
416-435 BL01177B 13.61 7.840e-
16 122-138 BL01177D 17.50
1.900e-15 441-459 


869 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415- 
442 BL01177C 17.39 5.333e-19
369-388 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 394-412 


871 


BL50007 


Phosphatidylinositol-specific
phospholipase X-box domain proteins 
prof 


BL50007A 19.61 1.000e-40 322-
368 BL50007D 19.54 1.000e-40 
589-631 BL50007B 20.90 6.700e- 
36 383-421 BL50007E 25.63 
9.053e-33 748-785 BL50007C 

O til C *>AA« in il CI j< £ A 

5.97 5.200e-19 452-469 


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972D 22.55 3.250e-17 90-
115


O /** 




SH3 DOMAIN SIGNATURE


PR00452B 11.65 4.250e-09 370-
386 


877


BL00741


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 5.500e-13 1343- 
1366 


878




PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.525e-09 52-85 


881


PD02807


APOLIPOPROTEIN E PRECURSOR
APO-E GLYCOPROTEIN PLAS.


PD02807E 10.90 4.702e-09 358-
407 


882 


PD01066


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26 


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE 


PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24
134-154 PR00372E 12.62 2.125e-
23 360-380 PR00372C 7.90
3.025e-22 289-309 PR00372F
13.09 6.333e-21 395-414
PR00372D 10.22 1.000e-19 329-
348 


887 


BL00301 


GTP-binding elongation factors proteins. 


BL00301B 20.09 2.800e-24 103- 
135 BL00301A 12.41 4.316e-13 
21-33 


888 


BL00518 


Zinc finger, C3HC4 type (RING finger), 


BL00518 12.23 1.667e-09 30-39




PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 4.906e-25 6-45


890 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 113- 
123 


892


BL01022


PTR2 family proton/oligopeptide 
symporters proteins. 


BL01022B 22.19 6.016e-14 72- 
118 BL01022E 23.51 1.173e-12 
472-508 BL01022A 11.58 9.135e-
12 42-61 BL01022D 9.42 3.455e-
11 199-212


893 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 *5.529e-10 360- 
383 


894 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


895 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237B 13.50 9.100e-14 116- 
138 PR00237F 13.57 1.360e-13 
312-337 PR00237G 19.63 9.069e-
13 353-380 PR00237E 13.03 
7.120e-12 243-267 PR00237D 
8.94 4.150e-11 194-216

rJvUUZo / A X I mo 4.J /De-1 1 oj- 

108 


896 


BL00129 


Glycosyl hydrolases family 31 proteins.


BL00129D 16.76 8.258e-26 634- 
678 BL00129A 26.21 1.720e-25 
384-430 BL00129E 22.60 4.857e- 
23 698-734 BL00129C 15.12
1 . / MJe-zz Dyo-OZ4 J9LAlU129i3 

19.19 5.891e-18 495-522 

£5JLUUlZ7r 20.17 / J*rDe-l3 514- 

852 


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101 


INHIBITOR HEAVY CHAIN 

rUAWXTTJT TXT 


PD01101B 21.53 1.000e-40 274-
327 PD01101D 24.45 1.000e-40 
457-512 PD01101A 
23 83-117 PD01101C 12.69 
1.237e-l0 36o-3o6 PD01101E 
6.73 7.750e-12 566-576 


nrvn 




P"D rVTTZ TXT DUnQDU ATA CC DOO A < evn 

SIGNATURE 


PR00600A 11.61 5.979e-09 31-52


901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 8.116e-31 24-63


903 


BL01115


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 U09e-ll 21-65 


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539- 
572 DM00215 19.43 4.750e-12 
549-582 DM00215 19.43 9.824e- 
11 551-584 DM00215 19.43 
2.929e-10 548-581 DM00215 
19.43 4.054e-10 550-583 
DM00215 19.43 5.339e-10 552-
585 DM00215 19.43 7.107e-10
544-577 


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314-
332 


908 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1125- 
1156 


909 


BL00107 


Protein kinases ATP-binding region 

proteins.


BL00107A 18.39 5.950e-17 1118-


910 


BL00107


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 8.560e-13 150-


911


BL00107


Protein kinases ATP-binding region
proteins.


BL00107A 18.39 8.560e-13 150-
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-ll 243- 
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435-
459 PR00962G 15.71 4.086e-26
593-618 PR00962B 11.98 9.122e-
26 296-319 PR00962A 13.28
6.143e-22 15-34 PR00962C 8.00
4.000e-21 348-369 PR00962F
12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-
643 PR00962I 11.68 9.786e-20
692-712 PR00962E 8.81 2.915e-
18 515-534


915 


PR00962 


LETHAL(2) GIANT LARVAE 


PR00962D 10.40 1.000e-27 365-
389 PR00962G 15.71 4.086e-26
523-548 PR00962A 13.28 6.143e-
22 15-34 PR00962C 8.00 4.000e-
21 278-299 PR00962F 12.39
9.769e-21 482-502 PR00962H
13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-
642 PR00962E 8.81 2.915e-18
445-464 




BL00134


Serine proteases, trypsin family, histidine 
proteins. 


BL00134A 11.96 5.886e-14 90-
107 


917 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.393e-13 211- 
226 BL00478B 14.79 6.712e-10 
271-286 




PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


TyTi AAA A AT~\ A A A C Tin« AA ATI 

PR00049D 0.00 5.729e-09 973- 
988 


922 


BL00150 


Acylphosphatase proteins. 


BL00150 25.33 1.000e-40 37-84 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 8.C63e-09 79- 
113 


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280- 
331 BL00072E 24.12 8.200e-24 
368-411 BL00072C 25.30 7.873e-
20 226-267 BL00072B 9.48
6.049e-12 183-196 


927 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229- 
256 BL00237A 27.68 6.657e-13 
90-130 BL00237D 11.23 9.571e-
13 290-307 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-47 
BL01033B 13.81 1.000e-15 93-
105 


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203- 
253 


932


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353- 
397 BL00415N 4.29 2.117e-09
63-107 BL00415N 4.29 3.628e-09
57-101 BL00415N 4.29 5.664e-09
347-391 


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85-
133 PD02448C 13.62 1.000e-40
152-189 PD02448E 11.33 9.000e-
30 223-249 PD02448F 14.22 
9.654e-25 267-291 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 293-
306 


934 


DM00191 


w SPAC8A4.04C RESISTANCE 
SPAC8A4.05C DAUNORUBICIN.


DM00191D 13.94 9.083e-10 136- 
175 


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67-
111 


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865-
895 


937 


PR00762 


CHLORIDE CHANNEL SIGNATURE 


PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.29 1.000e-21
268-288 PR00762E 12.07 3.250e-
20 520-537 PR00762D 11.29
1.000e-19 470-491 PR00762F
15.12 1.429e-19 538-558 

234 PR00762G 14.13 3.455e-17 
577-592 


938 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 9.500e-25 291-334 


939 


DM01111 


4 kw PHOSPHATASE
TRANSFORMING 61K PDF1.


DM01111E 17.28 1.568e-10 248-
297 DM01111E 17.28 5.168e-10
659-708 DM01111D 16.76
5.263e-09 279-325 DM01111M
10.67 8.674e-09 911-935




BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 1.000e-14 293-
309 BL00107A 18.39 6.760e-13
229-260


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543-
597 


943 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 3.500e-35 8-47 


945 


BL00989 


Clathrin adaptor complexes small chain 
proteins. 


BL00989B 26.51 1.000e-40 66- 
117 BL00989A 11.66 1.000e-13 
5-19 


946 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178D 13.52 9.571e-09 450-
469 


947 


BL00178


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 4.857e-09 713-
724 


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180- 
230 


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-11 26-49
PR00926F 17.75 6.348e-09 134- 
157 


955 


PF00109 


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357 


957 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069A 16.01 8.826e-24 26-51 
PR00069B 11.33 1.514e-17 86-
105 PR00069C 16.03 8.816e-14 
155-173 


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 
642 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354


HMG-I and HMG-Y DNA-binding
domain proteins (A+T-hook).


BL00354A 3.83 9.438e-10 1489-


963


BL00354


HMG-I and HMG-Y DNA-binding
domain proteins (A+T-hook).


BL00354A 3.83 9.438e-10 1489-
1499 


964


BL00027


'Homeobox' domain proteins.


BL00027 26.43 7.188e-27 53-96 


965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 581-
616 


966 


PR00515 


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE


PR00515D 7.91 5.741e-09 13-33


967 


BL00579 


Ribosomal protein L29 proteins. 


BL00579B 21.99 5.065e-21 164- 
194 


970 


BL00504 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 


BL00504C 18.68 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93 


973 


PF00580 


UvrD/REP helicase.


PF00580A 13.37 4.720e-09 249- 
271 


974 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99- 
139 


976 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins.


BL00031A 19.55 7.158e-33 60-93 

TJT AAA*, ITS T> < fArta OO t\A 

126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate-tetrahydrofolate ligase proteins. 


BL00721B 13.21 1.000e-40 346- 
401 BL00721D 13.90 1.000e-40 
538-592 BL00721E 13.46 1.000e-
40 597-646 BL00721I 18.79
2.500e-40 814-860 BL00721H
21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-
321 BL00721C 16.92 4.000e-30
498-535 BL00721F 15.96 8.232e-
27 660-702 BL00721G 7.97
3.017e-10 721-734 


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 180- 
201 


982 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 59-95 
BL00869E 13.12 9.129e-18 120- 
157 BL00869J 15.60 6.032e-17 
270-310 BL00869H 11.08 1.840e- 
16 219-242 BL00869G 13.55
2.543e-16 192-214 BL00869F
12.77 7.031e-14 157-192
BL00869I 12.92 3.274e-12 242-
270 BL00869D 14.02 5.282e-

10 31-61 


983 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196F 13.89 2.125e-09 92-108 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


BL00485D 30.82 2.427e-10 154- 
209 



* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid
sequence 
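
The footnote above describes each RESULTS entry as four fields in a fixed order. As a minimal sketch of how such entries could be read back out of the flattened table text, the following Python splits a results cell into records. The names `SignatureHit` and `parse_results` are hypothetical and are not part of the original disclosure, and the pattern assumes clean (not OCR-damaged) entries of the form `BL00869C 12.58 3.172e-19 59-95`, where the position range may wrap across a line break.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One RESULTS entry: subtype accession, raw score, p-value, position."""
    subtype: str      # accession number subtype, e.g. "BL00869C"
    raw_score: float  # raw score column
    p_value: float    # p-value column
    start: int        # first residue of the signature in the sequence
    end: int          # last residue of the signature in the sequence


# Two letters, five digits, optional subtype letter; then score, p-value,
# and a start-end range that may be split by whitespace after the hyphen.
_HIT = re.compile(
    r"([A-Z]{2}\d{5}[A-Z]?)\s+(\d+\.\d+)\s+(\d+\.\d+e-\d+)\s+(\d+)-\s*(\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Return every well-formed entry found in one RESULTS cell."""
    return [
        SignatureHit(m[1], float(m[2]), float(m[3]), int(m[4]), int(m[5]))
        for m in _HIT.finditer(cell)
    ]


if __name__ == "__main__":
    cell = "BL00869C 12.58 3.172e-19 59-95 BL00869E 13.12 9.129e-18 120-\n157"
    for hit in parse_results(cell):
        print(hit)
```

Entries damaged by extraction (for example a p-value with a missing decimal point) simply do not match the pattern and are skipped, which makes it easy to list the cells that still need manual correction.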



TABLE 4 



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 


tsp_1


Thrombospondin type 1 domain 


0.002 


22.1 


7 


7tm_1


7 transmembrane receptor (rhodopsin
family)


6.7e-08


27.3






9 


PWWP 


PWWP domain 


8.1e-16 


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20 


8U 


14 


Aa_trans


Transmembrane amino acid
transporter protein


2.7e-42


153.9








E1-E2 ATPase 


E1-E2 ATPase 


6.3e-124


412.2 


16 


trypsin 


Trypsin 


l^e-87 


278.6 


17 


ig 


Immunoglobulin domain 


7.6e-12 


43.2 


18 


lectin_c


Lectin C-type domain 


0.0003 


21.2 


20 


Alpha_L_fucos 


Alpha-L-fucosidase 


l^e-217 


736.5 





22 


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


25 


ank 


Ank repeat 


5.5e-14 


59.9 


27 


pkinase 


Eukaryotic protein kinase domain


1.5e-100 


347.4 


28 


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07


38.8 


33 


rrm 


RNA recognition motif. 


1.1e-17


72.2


34 


rrm


RNA recognition motif.


1.1e-17


72.2


36 


7tm_1


7 transmembrane receptor (rhodopsin
family)


3e-36 


117.3 


37 


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36 


133.9 


40 


alk_phosphatase 


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2 


Zinc finger, C2H2 type 


8.6e-103 


354.9 


45 


sugar_tr


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


l>2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_ter 
m 


Cadherin cytoplasmic region 


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding 
domain 


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60 


Kunitz_BPTI 


Kunitz/Bovine pancreatic trypsin
inhibito 


3.7e-47 


148.6 


62 


DAD 


DAD family 


2.5e-74


260.3 


63 


MOZ_SAS


MOZ/SAS family


5.9e-133 


455.1 


64


MOZ_SAS


MOZ/SAS family




42' * 


>— -.— . 

oi 


ras 


Ras family


S08J 


67 


Ham1p_like


Ham1 family


't 7o49 


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 




126.1 


70 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41 


1.2e-110


381.0 


72 


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81 


K_tetra


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3


84 


AAA 


ATPases associated with various 
cellular act 


Ue-77 


271.3 


85 


homeobox 


Homeobox domain 


1.4e-28 


108.3 


87 


TGF-beta 


Transforming growth factor beta like


6.7e-68 


210.2 


91 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5 


95 


adenylatekinase


Adenylate kinase


1.1e-15


60.0 


96 


ig 


Immunoglobulin domain 


4.1e-20 


69.8 


99 


CNH 


CNH domain 


3.4e-120 


412.7 


100 


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-47 


170.8 


102


zf-C2H2


Zinc finger, C2H2 type 


4.4e-89 


309.4 


103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c


Lectin C-type domain


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9 





112 


HSP20 


Hsp20/alpha crystallin family 


2.6e-20 


77.7 


115


EF_TS


Elongation factor TS


3.8e-63 


221.1 


116 


sugar_tr


Sugar (and other) transporter 


4e-63 


223.1 


118


catalase 


Catalase 


0 


1158.9 


119


UCH


Ubiquitin carboxyl-terminal
hydrolase, famil


1e-10


24.4 




metalthio 


Metallothionein


z.oe-25 


97.4 


125 


adh_short


short chain dehydrogenase 


1.6e-45 


164.6 


126 


KRAB


KRAB box


7.9e-25 


95.9 


127


G-alpha


G-protein alpha subunit


1e-249


843.0 


128 


mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


131 


EF1BD 


EF-1 guanine nucleotide exchange 
domain 


4.9e-53


189.6 


132 


GYF 


GYF domain 


4.9e-28 


106.6 


133 


GYF 


GYF domain 


4.9e-28 


106.6 


134 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr 


2.1e-33 


119.1 


135 


pkinase 


Eukaryotic protein kinase domain 


3.3e-86 


299.8 


136 


ank


Ank repeat 


2.2e-29 


111.1 


137 


IL8 


Small cytokines 
(intercrine/chemokine), inter


3.1e-18 


65.2 


139 


pyridoxal_deC 


Pyridoxal-dependent decarboxylase 
conse 


0.00011 


19.0 


140 


cadherin 


Cadherin domain 


1.3e-88 


307.8 


142 


efhand 


EF hand 


5.7e-33 


123.0 


143 


Acyltransferase 


Acyltransferase 


2e-29 


111.2 


146 


cytochrome_c


Cytochrome c 


1.7e-33 


124.7 


147 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.3 


148 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


1.7e-09 


45.0 


149 


aldo_ket_red


Aldo/keto reductase family


7.4e-189 


640.8 


150 


homeobox 


Homeobox domain 


3.2e-08 


38.7 


151 


PseudoU_synth_ 
1 


tRNA pseudouridine synthase 


4.7e-57 


203.0 


152 


abhydrolase


alpha/beta hydrolase fold


1 1.7e-?.! 
l.ie-09 






PDZ 


PDZ domain (Also known as DHR or
GLGF). 


150 . 


PHD 


PHD-finger


7.6e-15 




157 


fn3


Fibronectin type III domain


0.015 


21.9 


158 


homeobox 


Homeobox domain 


2.7e-27 


104.1 


160 


PWI 


PWI domain 


3.9e-24 


93.6 


162 


DnaJ 


DnaJ domain 


2e-06 


34.8 


164 


Cbl_N 


CBL proto-oncogene N-terminal 
domain 


8e-117 


401.5 


166 


metalthio 


Metallothionein


3.1e-26 


100.6 


167 


LRR 


Leucine Rich Repeat 


0.00069 


26.3 


169 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180


611.4 


170 


fibrinogen_C 


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180


611.4 


171


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


1e-149


510.8 


1 13 


homeobox 


Homeobox domain 


1.5e-29 


111.6 


VIA 


FYVE


FYVE zinc finger


7.4e-28 


103.8 


175 




GRIP domain


j.ye*uo 




182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71


250.0 


185 


CAP_GLY


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain 


2.2e-50 


180.8 


187 


TBC 


TBC domain


2.2e-50 


180.8 





IRA 
J oo 




PDZ domain (Also known as DHR or
GLGF).


ne-i j 


ft 


180 


Kelch 






16<; 6 




Trf»TW»TtlVO*ITTl 




^ R«-171 








Rieske [2Fe-2S] domain


0 0016 


1R ^ 

loo 


100 


ig


Immunoglobulin domain


S 0a_1O 


66 1 


202 


EGF 


EGF-like domain


3.4e-54 


193.5 


xAjj 


trefoil


Trefoil (P-type) domain


1a OA 


AC f 




TUP 


i.i>c uomain 


o.De-jo 


11 A A 


90^ 


Eif|Hnn 


PI? lianst 

xsr nana 


u.uvyo 


99 a 


906 


i ork._on aiiiic i 


oiow voitage-gaieu potassium 


U.UU3 1 


0.1 


907 


trefoil


Trefoil (P-type) domain


9 Oo_/1Q 

z.ye-4o 


191 9 
1 /J. / 


900 


1} tV»r»c Amal CIO 
IUOOSOIIlcU_0 J J 


O tkncnmal ntvt+oi-n Gil VC 1 Q 

iUDosomai proicin o i j/o i o 


1 9<»_9B 


9TW T 


910 


hemopexin


Hemopexin


1 'ia^m 

i .Je-oz 


991 < 


213 


TBC


TBC domain


2.5e-48


174.0 


91 * 


Basic 


Myogenic Basic domain 


4.3e-50 


179.5 




Ribosomal_L24


KOW motif


S.2e-23 




222 


fn3


Fibronectin type III domain


7.3e-141


481.4 


111 


cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


9Je-47 


168.8 


224 


efhand 


EF hand


6.1e-06 


33.2 




Pterin_4a 


Pterin 4 alpha carbinolamine 
dehydratase 


9.3e-42 


152.1 




ABC_tran


ABC transporter 


4.1e-110 


379.2 


234


E1_DerP2_DerF
2


E1 family


3.7e-90 


312.9 


235 


E1_DerP2_DerF
2


E1 family


1.6e-48 


174.6 


237 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


1.7e-25 


98.1 


238 


Opiods_neurope 
P 


Vertebrate endogenous opioids 
neurope 


1.8e-159 


543.2 


239


eIF-5a


Eukaryotic initiation factor 5A
hypusine


5.9e-104 


358.8 


240


Amino_oxidase


Flavin containing amine oxidase


2.5e-11


37.8 




zf-C2H2


Zinc finger, C2H2 type


2.1e->V 1 ?43.6 


244 


Band_7


SPFH domain / Band 7 family


2Je-53 


190.7 




ank 


Ank repeat


1.6e-88 


307.5 






Zinc finger, C2H2 type


6.7e-49


175.9 


247 


actin 


Actin 


2Je-42 


1403 


1>IQ 


ER_lumen_recep
t


ER lumen protein retaining receptor 


2.4e-155 


529.5 


9^A 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


2^e-38 


140.9 


9^9 




Collagen triple helix repeat (20
copies)


1.4e-13 


58.6 


95S 


P9 




0.052 


7.8 


257 


CAP_GLY


CAP-Gly domain 


1.4e-20 


81.8 


960 


WD40


WD domain, G-beta repeat


9.9e-62 


218.5 


961 


WD40


WD domain, G-beta repeat


9.9e-62 


218.5 


969 


WD40


WD domain, G-beta repeat


9.9e-62 


218.5 


96^ 


cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


7.8e-21 


82.6 




Ribosomal_L14


Ribosomal protein L14p/L23e


9.2e-10 


40.6 


265 


SAPA


Saposin A-type domain


4.4e-27 


103.4 


266 


SAPA 


Saposin A-type domain


4.4e-27 


103.4 


267 


ABC_tran


ABC transporter 


9.5e-39 


142.2 


269


Ribosomal_L14 


Ribosomal protein L14p/L23e 


6.2e-62


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8 





273 


rrm


RNA recognition motif. 


0.074 


14.6 


275 


lipocalin 


Lipocalin / cytosolic fatty-acid
binding pr


1 (a A 1 

z.je-4l 


\ A£L A 

140.4 


276


ras


Ras family


1 1 o_A7 

i.ie-o/ 


ZjO.j 


777 




Ubiquitin carboxyl-terminal
hydrolase, famil


1 7a^1A7 




019 
Z /o 


START


START domain


j.zc-uy 




070 

z/y 




WD domain, G-beta repeat


1 8a-77 
I .oc-Z / 


IflA 7 


Z5Z 




0"P<lU/fl UUIUalll 


7 Ra_77 

/.oe-zz 


50. U 


7R7 
ZO/ 


AnH t\l*ft1ift»«>i" 

Aflu piuiiicroi 


D1U1 inllJlijr 


I .Zc- 1 V 1 




7RQ 
Z07 


KRAB


KRAB box


7 1*Ol 


87 ft 
oZ.o 


zyj 


7tm_1


7 transmembrane receptor (rhodopsin
family)


j.je-/j 


7S£ ^ 
ZjO.O 


zyj 




oxj 1 mjiiictiii 




1 13.Z 


296 


Pyridox_oxidase 


Pyridoxamine 5'-phosphate oxidase


Ue-76 


268.0 






RNA recognition motif.


D.4C-4J 


1A7 Q 

loz.y 


zyo 


Ubie_methyltran 


ubiE/COQ5 methyltransferase family


iC Of* A< 

o.je-OD 


-yo*j 


299 


Ubie_methyltran


ubiE/COQ5 methyltransferase family


0.0024 


-118.1 


301 


Cyt_reductase


FAD/NAD-binding Cytochrome
reductase


7.7e-61


215.5 


302 


G-patch 


G-patch domain 


3.1e-14 


60.7 


307 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


7.7e-43 


138.2 


1AO 


PH


PH domain 


U.UUJD 


n o 
1 /.o 


1 1 A 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


1.4e-84 


OTA O 

270.8 


711 
Jl 1 




Rhodanese


Rhodanese-like domain 


J.3e-04 


ZzO./ 


J1Z 


tubulin


Tubulin/FtsZ family


4.ye-zoo 


yo3.o 


11/1 
Jl4 


SURF4


SURF4 family


l.ze-lyy 


o7o.o 


Jz5 




impB/mucB/samB family 


CO 


OAT < 

207.5 


1*>T 
JZ7 


cadherin 


Cadherin domain 


4.3e-yl 


"3 1 £ A 

316.0 


17Q 

Jzy 


VIA/"* 


nal domain 


z.le-zo 


1 AT O 

107.5 




IP trans 


Phosphatidylinositol transfer protein 


a c« no 

o.5e-9o 


335.7 


JjZ 


TFIIS


Transcription factor S-II (TFIIS)


o.oe-uj 


7Q 1 


Jj / 


zf-C2H2


Zinc finger, C2H2 type


j.oe-oi 


ZlO.O 




A TP C 


/viiv synxnase reiaieu proxein 


/Lo_17 


17H 7 




annexin


Annexin 


4.oe-ou 


\l ■ : 




Stathmin


Stathmin family


i.oe-yu 




.54/ 


Ribosomal_L16


Ribosomal protein L16 


4.oe-w 




348 


lactamase_B


Metallo-beta-lactamase superfamily


0.012 


-6.0 




efhand


EF hand


z.5e-14 


Ol.O 


JJJ 


lectin_c


Lectin C-type domain


i»je-yj 


lO 1 

JZ. 1 


354 


WD40 


WD domain, G-beta repeat 


2.2e-18 


74.5 




lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr


o.Je-10 


io.J 


1A7 

JOZ 


Acetyltransf


Acetyltransferase (GNAT) family


A AA1 A 


OA O 

Z4.y 


JOJ 


tRNA-synt_1


tRNA synthetases class I (I, L, M and
V)


4.oe-ioj 


A78 7 




OUllnmaC 


qui lauioP 


0. 1 c-ZZo 


77fl f* 


JUO 


START


START domain


j.oe-i i 


^ft *\ 


360 

JU7 




Eukaryotic protein kinase domain


7 /1a. in 




370 

J IV 


ACBP


Acyl CoA binding protein




lOO 7 


371 
j / 1 




Eukaryotic protein kinase domain


i.oe-y4 


177 S 


171 
0 1 j 




EGF-like domain


z.oe-iz 




17^ 
J ij 


zf-C2H2


Zinc finger, C2H2 type


0 7o_£A 


77^ 4 


377 


KRAB 


KRAB box 


3.7e-27


103.7 


379 


SET 


SET domain 


7.3e-61


215.6 


380 


Glyco_transf_8


Glycosyl transferase family 8 


0.0028 


-40.1 


381 


zf-C2H2 


Zinc finger, C2H2 type


4.3e-06 


33.7 


383 


Glyco_transf_8 


Glycosyl transferase family 8


0.0028 


-40.1 





384 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos_transf_2


Glycosyl transferases


1.3e-15




390 


Na_Ca_Ex


Sodium/calcium exchanger protein


3.9e-105 


362.7 


391


fn3


Fibronectin type III domain


4 1e.1fl9 


1S9 & 


392


fn3


Fibronectin type III domain


1 4**-4S 


1QJ.U 


393


fn3


Fibronectin type III domain


1 4e-4S 


1UJ.U 


394


ldl_recept_b


Low-density lipoprotein receptor
repeat


7 1p>40 


17S R 

1 / J.O 


395 


Ribosomal_L30


Ribosomal protein L30p/L7e


0.0023 


16 0 


396 


Oxysterol_BP


Oxysterol-binding protein


1.5e-94 


327.5 


397 


RDS_ROM1


Peripherin/rom-1 


2.9e-33 


123.9 


1QQ 


lavlnlllcac O 


IVA6ulUO*PcI^iaClnninSC SUpciJLallluy 


1 4a_10 


141 ^ 


4ft9 




F-box domain.


ft ftftft7 


9ft 1 




f*T P nmtAflCP 


protease 




99A 9 


4ftS 


Ae 


1} itvtc/*tmi>l rwYtpin T 1^A» 
JnJ.UU5UIllal prULCUl LjOjj\C 


Do- / / 


9/%0 ft 

zoy.u 


406 


LIM 


LIM domain containing proteins


ft 0ftft9 1 


9ft 7 


410 


tRNA-synt_1c


tRNA synthetases class I (E and Q)


1 r-91/5 


700 ft 


41 1 
*ti i 




Nucleotidyltransferase domain


1 0a-1£ 


A7 ft 


419 
•HZ 


DP AH 




ft ftftftl/t 


17 9 


did 


TOTF04 


i/umain ox un Known iudcuoii Jjuryf 


ft ftnm 1 

U.VUUJ 1 


zo.y 


HI J 


tubulin


Tubulin/FtsZ family




Q71 7 




SET


SET domain




9ftl S 


421 


WD40 


WD domain, G-beta repeat 


6.1e-29 


109.6 


491 
*tZJ 


zf-C2H2


Zinc finger, C2H2 type


ioe-jy 


i«w.y 


49 d 


pkinase


Eukaryotic protein kinase domain


o. ye-/ j 


ZOl.O 


**Zo 


LIM


LIM domain containing proteins 


1 .oe-34 


izo./ 


411 


kazal 


Kazal-type serine protease inhibitor 

domain


j./e-io 


7Q O 


419 


SH2


Src homology domain 2


1 .'fC-O / ■ 


lOft 4 


411 


zf-C2H2


Zinc finger, C2H2 type




4yz. / 


414 


ras


Ras family

ft ft19 


ft 


416 


E1-E2_ATPase


E1-E2 ATPase


1 .DC- 1 1 1 


ioi n 


417 


Q\IA nnl A 


XU.N/1 : ; ; lUCTaSC ctipilil 3LUJU.ii! r 


t ; 


1 ATT 7 
1U/ /./ 


438 




PiO anger 


1.6e-11


51.7 


410 


lectin_c


Lectin C-type domain




1 1 J.j 


440 


zf-C2H2 


Zinc finger, C2H2 type


1.1e-65


231.6 


441 


arrestin 


Arrestin (or S-antigen)


z.ye-zD4 




449 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


o^e-oU 


231.1 


441 




Ubiquitin carboxyl-terminal
hydrolase, famil


O.JC-1Z 


3Z.0 


444 


CTF_NFI


CTF/NF-I family


9 6a-977 


014 ^ 


451 


T-box 


T-box 


1 J?ft-1 17 


4ft9 (\ 


453 




Rieske [2Fe-2S] domain


9 ^-11 


^7 7 


454 




Zinc frnoer tvnp 


1 0a-/«4 


79fi ^ 
ZZQ.J 


456 


UvUl wv %J\Jf\. 


IIUUIVUUVA UUlUOlll 




1ft 0 


459 


ig


Immunoglobulin domain


9 /Ip-9ft 


7ft S 


460 


Hydrolase


haloacid dehalogenase-like hydrolase




06 0 

7U.7 


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 




Calponin homology (CH) domain


9 4a 17 


71 1 
/l.l 


467 




Calponin homology (CH) domain


O 4a IT 


71 1 
/l.l 


468 


Sterol_desat


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase


Cyclophilin type peptidyl-prolyl cis-
tr


9 6e^il 


99fl 0 


470 


Peptidase_M24


metallopeptidase family M24


6e-08 


28.1 


471 


PDZ


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129 


441.9 


All 


myD_jLJiNA- 


Myb-like DNA-binding domain 


3.6e-0o 


33.9 


473 


77 


£#inc imgcr present in aysu upnin, k^o 


a mo 


*)A A 

zU.U 


*r/*t 


EF1G_domain


Elongation factor 1 gamma,
conserved doma


O^e-oo 


303.3 


47S 


Ribosomal_L31e


Ribosomal protein L31e


o.ie-oo 


ZSZ.O 


ATMs 


PI n 


€ * 1 si 4*1 AT¥t ITI 

v^iq aomaiii 


Z.jC- /j 


2.0D.I 


All 
*rf I 




OJ13 aunjain 


i.ie-iz 


55.0 


AIR 


MoaA_NifB_PqqE


moaA / nifB / pqqE family


A AAO 


ITT 
-1 /./ 


470 


FYVE


FYVE zinc finger


Sr.3e-Zl 


tO.O 


480 




uvtrv. puxym cruse ianxiiy a 


7 Qo-ylX 

z.je-w 


J0/.4 


4R2 


adh_short


short chain dehydrogenase


l.ze-oz 


OO 1 £ 


41tt 

*tOJ 


ank


Ank repeat


l.^e-i / 


/ 1 .y 


4X4 


IMS


impB/mucB/samB family


z.ze-oJ 


OOA C 

290.5 


486 


TTP 


1 -LTV UOmalQ 


1 Oa_1Q 

j.ze-iy 


O/.o 


4R7 
to / 


FMO-like


Flavin-binding monooxygenase-like 


u 


1425.5 


4R8 
too 




T/T AX/FO Hnrnoin 

i/lwhv^ uoinain 


y.De-iui 


1A 1 A 

341.0 


495 


homeobox 


Homeobox domain 


3.6e-06 


30.8 


AQH 


pkinase 


Eukaryotic protein kinase domain 


2.3e-166 


566.1 






Fibronectin type III domain


2.5e-237 


801.8 


501


LRR


Leucine Rich Repeat


9.3e-31


115.6 


502 


RGS 


Regulator of G protein signaling 
domain 


0.041 


11.9 


503 


filament 


Intermediate filament proteins 


1e-142


487.5 


505


fn3


Fibronectin type III domain


1.3e-100 


347.7 


<A£ 

506


HECT


HECT-domain (ubiquitin-
transferase).


1e-13


59.0 


507


Ribosomal_L7A
e


Ribosomal protein L7Ae 


5.7e-26 


99.7 


sna 




WD domain, G-beta repeat


0.063 


19.8 






WD domain, G-beta repeat


0.063 


19.8 


510 


WD40 


WD domain, G-beta repeat


2.1e-42 


154.3 


511


pkinase


Eukaryotic protein kinase domain 


2.3e-86 


300.4 


512


G-gamma


GGL domain


1.9s-03 


34.1 


— - 




i>rU aomain 


1e-06


54.2


SIS 


ri 1 Jr\_y -.ruL* 


Bacterial regulatory helix-turn-helix 
protei 


3.9e-27 


103.6 


Sin" 


zf-C2H2


Zinc finger, C2H2 type


i./e-34 


128.0 


S17 

31 # 




S1 RNA binding domain


o.ieoo 


205.9 


Sift 


pkinase


Eukaryotic protein kinase domain


l.o©- /J 


264.2 


525 




^ounenn aomain 


0„ OA 


ion c. 


528 




Zinc finger, C2H2 type


/1a_*7A 


OA£. A 

Z40.4 


529 




Neurotransmitter-gated ion-channel




/5U.o 


531 


TDia/tFF 


xviiuvjiir uoiniiui 


j.oe-44 


1 £A O 
lOOU 


532 


myosin_head


Myosin head (motor domain)


U 


14y4.5 


533 




Leucine Rich Repeat


tt ^owl S 

o.je-13 


en a 
OZ.O 


sis 
j j j 




Sec7 domain


j.Ae-yz 


11 O 1 

319.1 


536 


homeobox 


Homeobox domain 


4.8e-05 


26.4 


537


actin


Actin


2.4e-100


330.6 


542 


ank 


Ank repeat 


1.9e-35


131J 


S44 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H type


2.8e-10 


41.7 




DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.4e-40 


147.4 


547 


HMG_CoA_synt


Hydroxymethylglutaryl-coenzyme A
synthas


A 

V 


19S0 R 


549 


laminin_G


Laminin G domain 


3.3e-76 


266.6 


551 


PHD 


PHD-finger 


0.008 


9.3 


552 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


0.0017


25.0








Aim/ 
WW 


WW domain 


1 1a> *iA 

I.je-z4 


y5.J 


55 o 


kinesin


Kinesin motor domain


1 Qa_T7X 

i.oe-i /o 


5yy. / 


< r o 

559 




Zinc finger, C3HC4 type (RING
finger) 


n nnr.e< 
U.UUUoO 


10.5 


563 


efhand 


EF hand


7.9e-11


49.4 


567


PH


PH domain 


/.oe-oo 


z5.y 


568 


PH 


PH domain 


3.1e-39 


143.8 


569 


Hist_deacetyl


Histone deacetylase family 


5.2e-106 


365.6 


570 


PDZ 


PDZ domain (Also known as DHR or 


3.4e-20 


80.5 


571 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1e-16


58.5


D/J 


ubiquitin


Ubiquitin family 


1 A a. f\Q 

i.4e-05 


Jl.l 


5 /4 


■euro 


Formin Homology 2 Domain


1 la. \ 1 f\ 


OQft O 

jov.y 


576 


serpin


Serpins (serine protease inhibitors) 


4.3e-146 


496.4 


579




Zinc finger, C2H2 type


5.7e-76


265.0 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581 


RhoGAP 


RhoGAP domain 


4.4e-53 


189.8 


582 


Ribosomal_L7A
e


Ribosomal protein L7Ae


0.028 


1.0 


584 


kazal 


Kazal-type serine protease inhibitor 

domain 


2.2e-52 


187.4 


585 


LRR


Leucine Rich Repeat 


4.4e-28 


106.7 


586 


PHD 


PHD-finger 


3.8e-12 


53.8 


588 


GTP1_OBG


GTP1/OBG family


1.1e-62


215.2 


590 


Collagen 


Collagen triple helix repeat (20 
copies) 


8e-42 


152.4 


591 


lys 


C-type lysozyme/alpha-lactalbumin 
family


1.6e-31 


116.4 


596 


ACBP


Acyl CoA binding protein 


0.0022 


-9.4 


597


SNF2_N


SNF2 and others N-terminal domain


3.7e-98 


339.5 


600 


KRAB


KRAB box


1.3e-29


111.8


606 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


•:uo 


WD40


WD domain, ^-^ *» -pat. . [ i.je-23 


i*9.i* 


610 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237 


802.4 


613 


THFDHGCY 
H 


Tetrahydrofolate 
dehydrogenase/cyclohydro


4.9e-173 


588.3 


617 


rrm 


RNA recognition motif. 


4e-14 


60.4 


618 


rrm


RNA recognition motif.


4e-14 


60.4 


620 


cofilin_ADF


Cofilin/tropomyosin-type actin- 
binding pr 


3e-06 


34.2 


OZl 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.5 






Ubiquitin carboxyl-terminal 
hydrolase family 


5.8e-21 


83.1 






Zinc finger, C2H2 type 


2.5e-124 


A 


ozo 




DEAD/DEAH box helicase 


2.5e-68 


xiy.u 


AT? 


Uu 1 


Glutathione S-transferases. 


4.8e-26 


oy.u 


All 


5_nucleotidase


5'-nucleotidase


6.6e-248 






LIM


LIM domain containing proteins 


1.6e-88 


5U/.5 


/in 




Eukaryotic protein kinase domain 


1.5e-73 


Z5/.5 


Ojo 


MSP_domain


MSP (Major sperm protein) domain 


8.4e-09 


AO H 


05 y 


metalthio 


Metallothionein


2e-24 


qa a 
y4.o 


641 




Zinc finger, C2H2 type 


6.1e-114 




642 


Ribosomal_S28e


Ribosomal protein S28e


9.3e-48


172.1 


643 


Ribosomal_S5


Ribosomal protein S5


8.3e-87


301.8 


646 


PHD 


PHD-finger 


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat 


1.5e-22 


88.4 


648




Lipase/Acylhydrolase with GDSL-
like motif


A A1 < 

U.UJ3 


2.2 


652 


zf-C2H2 


Zinc finger, C2H2 type 


4.1e-146 


498.8 


653


histone


Core histone H2A/H2B/H3/H4


1 Oo_1A 


AQ 0 

45.0 


654


zf-C2H2


Zinc finger, C2H2 type


i .ye-o / 


1A1 A 

303.9 


655 


ras


Ras family


0.4©-/ / 


")CC% A 


657




Zinc finger, C3HC4 type (RING
finger)




AH A 

40.4 


658 


STphosphatase 


Ser/Thr protein phosphatase 


2.6e-182 


619.1 


659




Zinc finger, C2H2 type


1 . je-yz 


OO 1 1 

321.1 


660 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-85 


297.6 


CC r ) 
OOZ 




Nucleoside diphosphate kinases 


1.4e-119


410.7 


CCA 

004 


IRF


Interferon regulatory factor 
transcription f 


7e-20 


79.5 


665 




4-hydroxyphenylpyruvate 
uio xygenase o icxiu 


1 .4e-l o 


co e 
00.5 






DEAD/DEAH box helicase


4.oe- /4 


O^O 1 
Zj /.l 


ccn 
wt 




DEAD/DEAH box helicase


z.ye-/u 


oo< 1 

ZZO.l 


66Q 




Eukaryotic protein kinase domain




OOO o 


U/ 1 


homeobox


Homeobox domain


A A1 ft 


1 A < 
10.0 


678 


crystall


Beta/Gamma crystallin 


4.7e-106 


365.8 


o/y 


WD40


WD domain, G-beta repeat


1.9e-06 


34.9 


OoU 


Keratin_B2


Keratin, high sulfur B2 protein


4.1e-06 


15.9 


682 


G-gamma


GGL domain 


8.5e-33 


117.9 


0&5 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.4e-29 


111.7 


OoO 


Acetyltransf


Acetyltransferase (GNAT) family 


6.6e-10


46.4 


Oo/ 


7tm_1


7 transmembrane receptor (rhodopsin
family)


4.6e-15 


50.0 


6RR 

OOO 


proteasome


Proteasome A-type and B-type


C f a CA 


OOC T 

225.7 


007 


SCP2


SCP-2 sterol transfer family


o.2e-37 


136.1 


690 


TS-N 


TS-N domain 


0.041 


20.1 


oyz 




Zinc finger, C2H2 type


9.9e-60 


211.9 




zf-MYND


MYND finger


0.03o 


5.5 




Oxysterol_BP


Oxysterol-binding protein


*3 A* 1 OO 

3.9e-133 


455.7 

"lis J - 


69i 


PDZ


PDZ domain (Also known as DHR or
GLGF).


\3e-30 " 




Peptidase_C2


Calpain family cysteine protease


2.3e-175


Ct\£. A 

596.0 


/uo 


filament


Intermediate filament proteins


7.2e-107


368.5 


/1U 


fibrinogen_C


Fibrinogen beta and gamma chains, 
C-term 


OA 


278.0 


711 


SH2 


Src homology domain 2 


2.3e-65 


192.1 


717 
/ 1Z 




ATP synthase, Delta/Epsilon chain 


U.UOUoz 


1A A 

ly.o 


71^ 


ARID


ARID DNA binding domain


ze-i / 




71A 




LfOr / DJri / Kstsi r iamiiy 


o.oe-34 


io< o 
125.7 


71 5 


pm nnl T 


-kina polymerases \-> / 1 j to 10 kl/3 

OUULUlil 


4.o©-4V 


1 /03 


716 


ICR ATI 


FTJAD 


1 lex-A*) 

i.je-4z 


1 ft 


717 


mito_carr


Mitochondrial carrier proteins






719 


Gal-bind_lectin


Vertebrate galactoside-binding lectin 


1.5e-25 


902 


776 
/zo 


aldedh


Aldehyde dehydrogenase family 


1 o« t 1 o 

i.3e-i iy 


/l 1 A 0 


728 


Glycos_transf_2


Glycosyl transferases 


4e-21 


83.6 


/J4 




hlmz aomam 


2e-34 


127.0 


/ jj 


nn<e 

rxOj 


Protein phosphatase 2A regulatory 


0 


1 AO O 1 

1038.2 


737 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4e-14 


60.4 


740 


WD40 


WD domain, G-beta repeat 


5.6e-14 


59.9 


745 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 


3.8e-13 


46.9 












749 


mito_carr


Mitochondrial carrier proteins






750 


DUF27 


Domain of unknown function DUF27






751 


SH3 


SH3 domain 




7fi S 
/U.J 


752 


HMG_box


HMG (high mobility group) box 


8.6e-13 


55.9 


753 


SPRY 


SPRY domain 


j.ye-io 


7*1 1 


754 


GTP_CDC


Cell division protein


7 


<71 "7 


755 


mito_carr


Mitochondrial carrier proteins


Qa_O0 

oe-ao 




756 


TSPN 


Thrombospondin N-terminal -like
domains


o.ieoo 




757 


BTB 


BTB/POZ domain 


J. /C"ZJ 


fiO 7 

07. f 


759 


zf-C2H2


Zinc finger, C2H2 type


1 7*»-17 




760 


NSF 


NSF attachment protein


fk 4p-177 

w.*tC~ 1 L, / 


41^ 1 


762 


Ribosomal_S14


Ribosomal protein S14p/S29e 


2.1e-06 


24.8 


765 

/UJ 


l jilt loiimy 


J lTiiP fnmilv 


i 7« ao 

i./e-jy 


144.6 


766 


DnaJ 


DnaJ domain


j. ye- jo 


ill c 

133.5 


768 


tRNA-svnt ?h 


tRTJA cvnth^tncA place TT 


O 1a 01 

y.ie-oi 




769 




l»vj w -ticiiDi iy iipupr uicm recepror 
domain 


A 

u 


1404.5 


770 




WD domain, G-beta repeat


ze-zi 


OA C 

84.0 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


ha 


SNF2_N


SNF2 and others N-terminal domain


5.5e-99


342.3


776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


l.le-30 


115.4 


777 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


l.le-30 


115.4


/ /o 


VPS9


Vacuolar sorting protein 9 (VPS9) 
domain 


l.le-30 


115.4 


/ /y 




Zinc finger, C3HC4 type (RING
finger) 


3.1e-08 


31.0 


781 


cadherin 


Cadherin domain 


5.6e-113 


388.7 


7Jtt 

/ Oj 


HECT 


HECT-domain (ubiquitin- 
transferase). 


4^e-31 


116.8 


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3


786 


sushi


Sushi domain (SCR repeat)


1.8e-60


214.3 




vwa


von Willebrand factor type A domain


1.9e-52


187.7


790 


rrm 


RNA recognition motif. 


2.8e-20 


80.8 


791 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


792 


pkinase 


Eukaryotic protein kinase domain 


0.023 


12.4 


795 


zf-C2H2 


Zinc finger, C2H2 type 


6Je-95 


328.7 


796 


adh_short


short chain dehydrogenase 


4.1e-05 


-7.3 


799 


SAICAR_synt


SAICAR synthetase 


6e-125 


428.5 


805 


WD40 


WD domain, G-beta repeat 


4e-65 


229.8 


806 


ZU5 


ZU5 domain 


4.7e-37 


136.5 


807 


WD40 


WD domain, G-beta repeat 


0.016 


21.8 


808 


WD40 


WD domain, G-beta repeat 


0.0041 


23.8 


809 


pkinase 


Eukaryotic protein kinase domain 


2e-31 


117.2 


810 


vwa 


von Willebrand factor type A domain


1.9e-52 


187.7 


814 


zf-C2H2


Zinc finger, C2H2 type 


4.5e-83 


289.4 


815 


zf-C2H2 


Zinc finger, C2H2 type 


6e-74 


259.1 


817 


myosin_head


Myosin head (motor domain) 


1.5e-176 


599.9 


818 


GSPII_E 


Bacterial type II secretion system 
protein 


0.012 


11.5 


819 


PDEase 


3'5'-cyclic nucleotide 
phosphodiesterase 


l.le-74 


215.5 


821 


PH 


PH domain 


0.00025 


20.5 


822 


CNH 


CNH domain


0.00015 


-24.7 


827 


rrm


RNA recognition motif. 


1.5e-06 


35.2 


R90 


HMG_box


HMG (high mobility group) box




R 


can 

OJv 




RasGEF domain


z.ze- 1 


J J j.j 


OJ 1 


CNH 


CNH domain


je-i is 








Mitochondrial carrier proteins


j./e-j / 


IjuJ 




PV 

I A 


PY Hnmnin 


^./e-iy 


/ f.J 


837 


Y_phosphatase 


Protein-tyrosine phosphatase 


1.6e-263 


888.8 


Oj5 




Ank repeat




Ol 1 < 






Ank repeat


j.oe-3o 


iJy.o 


842 


Ribosomal_L15e


Ribosomal L15e


4.8e-131 


448.8 


o4J 




Sodium:neurotransmitter symporter
family


0 


iom o 

1201.8 




Peptidase_M16


Insulinase (Peptidase family M16)


4.7e-o7 




IMS 


isr luu 


EF-1 guanine nucleotide exchange 

domain




200.7 


849 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-122 


420.5 


OJU 




Zinc finger, C2H2 type


ze-o7 


Mil A 


RV? 


Old 


oio *uoniain 


3.oe-Ju 


ill iC 

1 li.o 






jKXioo/vr aomam 


1- 10-37 


138.0 






PDZ domain (Also known as DHR or
GLGF). 


f la 1A 

j.ie-iu 


4o.7 


ODD 




Acyl-CoA oxidase


9.1e-263 


886.3 


©CQ 
OJO 


efhand 


EF hand


2.4e-18 


74.4 


860 


homeobox 


Homeobox domain 


4e-22 


86.9 


862 


TFIIF_beta


Transcription initiation factor IIF, 
beta 


2^e-l34 


459.8 


oOO 




Alpha-2-macroglobulin family 


4.9e-21


70.9 


867 


MoCF_biosynth


Molybdenum cofactor biosynthesis 
protei 


5.8e-205 


694.3 


868 


EGF 


EGF-like domain 


4.1e-22


86.9 


869 


EGF 


EGF-like domain 


1.1e-22


88.8 


OT 1 


PI-PLC-X 


Phosphatidylinositol-specific
phospholipase 


7.2e-95 


328.6 


O/Z 




Ubiquitin carboxyl-terminal 
hydrolase family 


1.1e-20


82.1 


874 


SH3 


SH3 domain 


2.2e-14 


61J 


Or/ 




: . *:f 3 domac: 




3ii. / 


oOZ 


KRAB


KRAB box


^ ft— a.. 
o.ye-'T^ 


162.6 


885 


ank 


Ank repeat 


7.1e-07


36.3


OOO 


biopterin_H 


Biopterin-dependent aromatic amino 
acid h 


0 


988.3 


OQ/7 


GTP_EFTU


Elongation factor Tu family


4.9e-129


437.5 


RRR 

OOO 




Zinc finger, C3HC4 type (RING
finger)


l.oe-14 


51.4 


RRQ 

007 




Zinc finger, C2H2 type


3./e-y2 


Jly.o 


890 




Immunoglobulin domain 


3.8e-06 


24.8 




Jr 1 JV& 


ir\Ji iamiiy 


y.De-4o 


1 1Z'S ft 

163.0 


oyj 


Sulfatase


Sulfatase


3.5e-75


273.2


894 


Sulfatase 


Sulfatase 


3.5e-78 


273.2




7tm_1


7 transmembrane receptor (rhodopsin 
family) 


4.5e-51 


164.4 


896


Glyco_hydro_31


Glycosyl hydrolases family 31 


0 


1277.3 


oy/ 


chromo 


'chromo' (CHRromatin Organization
MOdifier)


3.9e-06


26.0


898


Cbl_N


CBL proto-oncogene N-terminal
domain 


l^e-273 


922.4


899 


vwa 


von Willebrand factor type A domain 


5.5e-32 


119.7 


900 


WD40 


WD domain, G-beta repeat 


2.7e-07 


37.7 


901 


zf-C2H2 


Zinc finger, C2H2 type


4e-156 


532.1 


903 


ras 


Ras family 


6.6e-101


348.6 


904 


Armadillo_seg


Armadillo/beta-catenin-like repeats


1 1p-A6 


K 6 


906 


FH2


Formin Homology 2 Domain




Ifi^ 97 
JOJ. / 


907 


^/Jr UUy ijr iU culdX 


r^vHH vlv ftrnn efi»rft qp 
wy uuy ty iu cuioicii cue 


i 4p_n^ 


90 


908 


pkinase


Eukaryotic protein kinase domain


1 9<»-/vd 


99R 9 


909 




Eukaryotic protein kinase domain


oJc~/y 


94^ ^ 
Z*f J.J 


910


pkinase


Eukaryotic protein kinase domain


9 0p-49 




911 


pkinase


Eukaryotic protein kinase domain


1 9p*^^ 


111 R 


919 


PHD 

X ILLS 




J.1C-VK) 




on 


PHD


PHD-finger


J.JC"10 


00.3 


916 


filament


Intermediate filament proteins


0 7a_191 




917 


LIM


LIM domain containing proteins


c Op-1 ^ 
j.y©-i j 


^9 O 


918 


SAM 


SAM domain (Sterile alpha motif) 


4.3e-16 


66.9 


099 


Acylphosphatase


Acylphosphatase


9 Qa_£? 


111 JC 


924


ig


Immunoglobulin domain


i.je-Uo 


32.6 


925


Acyl-CoA_dh


Acyl-CoA dehydrogenase


9 /4rk_1 n 


A A A 0 


927 


7tm_1


7 transmembrane receptor (rhodopsin
family)


2.9e-45 


145.9 




globin


Globin


9 

-Z.4e-^z 


1 Of A 

loo.y 


929


sugar_tr


Sugar (and other) transporter


1 9*» IJC 


£0 O 

oo.o 


932


Collagen


Collagen triple helix repeat (20
copies)


0.00097


9.7


Oil 


HMG_box


HMG (high mobility group) box




125.0


014 


SEA


SEA domain




24.7


01^ 


ras


Ras family


o.4eoy 


OAA O 

209^ 


016 
yjo 




Calponin homology (CH) domain


j.oe-zi 


83.7 


017 




Voltage gated chloride channels


1 On 1 AA 


676.0 


938


homeobox


Homeobox domain


l.ye-Z5 


AO A 

yo.O 


04fl 


pkinase


Eukaryotic protein kinase domain


Q Oa_CO 

y.ye-35 


205.2


942


Myosin_tail


Myosin tail 


i a« no 

j./e-uy 


38.2


943


zf-C2H2


Zinc finger, C2H2 type




320.3 


04 S 




Clathrin adaptor complex small chain


1.3e-7o 


268.0 


046 


sugar_tr


Sugar (and other) transporter 


A Al*7 
0.01 / 


-122.8


947 


tRNA-synt_1e


tRNA synthetases class I (C) 


0.00097 


15.6 


y4o 


PHD


PHD-finger


2.2e-17 


71.2 


0*1 




Sugar (and other) transporter


A AAOO 

0.00?9 ^ 


-113 -i 1 






Mitochondrial carrier proteins




r ' 1 OA * 

139./ 


953 


myb_DNA-binding


Myb-like DNA-binding domain


4.5e-2v j 


80.1


t 9<e, 




Beta-ketoacyl synthase


7 1* too ( 




957 


aldo_ket_red


Aldo/keto reductase family


i .De-yo 




959 


Kelch 


Kelch motif 


0.02 


20.8 


061 


ras


Ras family


O O a OA 


111.1


yon 


homeobox


Homeobox domain


5.4e-22


86.5


96^ 


JTxl 


rxx qo main 


Je-zl 


80.9


066 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1 Oa AO 

z^£e-uy 


34.7


967 


Ribosomal_L29


Ribosomal L29 protein


l.Oc-lj 


03.0 


970 


FAD_binding_2 


FAD binding domain 


8.9e^7 


166.6 


971


rve


Integrase core domain


A AAA1 < 


19.5


079 


Glycos_transf_2


Glycosyl transferases


z.xe-zi 


84.3


974 


IVLuOSOIUal LtlXJ 


ixioosomai protein ijiu 


j.ie-4o 


1 13,0 


7 f J 


7tm_1


7 transmembrane receptor (rhodopsin
family)


i ^a 

l.oe-37 


121.3 


07£ 


zf-C4


Zinc finger, C4 type (two domains)


2.1e-52 


178.5 


977 


zf-C2H2


Zinc finger, C2H2 type


6.6e-150 


511.4 


978 


FTHFS 


Formate--tetrahydrofolate ligase


0 


1367.2 


982 


Renal_dipeptase 


Renal dipeptidase


1.3e-73 


258.0 


984 


A_deaminase


Adenosine/AMP deaminase 


2.6e-05 


-48.6 
TABLE 5



SEQ ID NO:
of full-length
nucleotide
sequence


SEQ ID
NO: of
full-length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID NO:
of contig
peptide
sequence


Priority docket
number_corresponding
SEQ ID NO: in
priority application


SEQ ID NO: in
U.S.S.N. 09/496,914


1


985


1969


2953


787CIP2_1


1 jU


2


986


1970


2954


787CIP2_2





3


987


1971


2955


787CIP2_3


1884


4


988


1972


2956





2123


5


989


1973


2957





2313


6


990


1974


2958


787CIP2_6


3284


7


991


1975


2959


787CIP2_7


3324


8


992


1976


2960


787CIP2_8


6182


9


993


1977


2961


787CIP2_9


6210


10


994


1978


2962


787CIP2_10


6213


11


995


1979


2963


787CIP2_11


6257


12


996


1980


2964


787CIP2_12


6294


13


997


1981


2965


787CIP2_13


6294


14


998


1982


2966


787CIP2_14


6330


15


999


1983


2967


787CIP2_15


6364


16


1000


1984


2968


787CIP2_16


6455


17


1001


1985


2969


787CIP2_17


6486


18


1002


1986


2970


787CIP2_18


6503


19


1003


1987


2971


787CIP2_19


6528


20


1004


1988


2972


787CIP2_20


6572


21


1005


1989


2973


787CIP2_21


6578


22


1006


1990


2974


787CIP2_22


6593


23


1007


1991


2975


787CIP2_23


6603


24


1008


1992


2976


787CIP2_24


6603


25


1009


1993


2977


787CIP2_25


6679


26


1010


1994


2978


787CIP2_26


6744


27


1011


1995


2979


787CIP2_27


6762


28


1012


1996


2980


787CIP2_28


6770


29


1013


1997


2981


787CIP2_29


6770


30


1014


1998


2982


787CIP2_30





31


1015


1999


2983


787CIP2_31


6aj8


32


1016


2000


2984


787CIP2_32


6866


33


1017


2001


2985


787CIP2_33


6938


34


1018


2002


2986


787CIP2_34


6938


35


1019


2003


2987


787CIP2_35


6977


36


1020


2004


2988


787CIP2_36


7001


37


1021


2005


2989


787CIP2_37


7002


38


1022


2006


2990


787CIP2_38


7004


39


1023


2007


2991


787CIP2_39


7005


40


1024


2008


2992


787CIP2_40


7006


41


1025


2009


2993


787CIP2_41


/UU5


42


1026


2010


2994


787CIP2_42


7014


43


1027


2011


2995


787CIP2_43





44


1028


2012


2996


787CIP2_44


7022


45


1029


2013


2997


787CIP2_46


7057


46


1030


2014


2998


787CIP2_47


70 SJt


47


1031


2015


2999


787CIP2_49


7088


48


1032


2016


3000


787CIP2_50


7089


49


1033


2017


3001


787CIP2_51


7182


50


1034


2018


3002


787CIP2_52


7489


51


1035


2019


3003


787CIP2_53


7564


52


1036


2020


3004


787CIP2_54


7566


53


1037


2021


3005


787CIP2_55


7587


54


1038


2022


3006


787CIP2_56


7591


55


1039


2023


3007


787CIP2_57


7600


56


1040


2024


3008


787CIP2_58


7604


57


1041


2025


3009


787CIP2_59


7612


58


1042


2026


3010


787CIP2_60


7613


59


1043


2027


3011


787CIP2_61


7615


60


1044


2028


3012


787CIP2_62


7616


61


1045


2029


3013


787CIP2_63


7617


62


1046


2030


3014


787CIP2_64


7623


63


1047


2031


3015


787CIP2_65


7625


64


1048


2032


3016


787CIP2_66


7625


65


1049


2033


3017


787CIP2_67


7630


66


1050


2034


3018


787CIP2_68


7638


67


1051


2035


3019


787CIP2_69


7640


68


1052


2036


3020


787CIP2_70


7670


69


1053


2037


3021


787CIP2_71


7676


70


1054


2038


3022


787CIP2_72


7688


71


1055


2039


3023


787CIP2_73


7690


72


1056


2040


3024


787CIP2_74


7700


73


1057


2041


3025


787CIP2_75


7774


74


1058


2042


3026


787CIP2_76


7784


75


1059


2043


3027


787CIP2_77


7785


76


1060


2044


3028


787CIP2_78


7707


77


1061


2045


3029


787CIP2_79


770R


78


1062


2046


3030


787CIP2_80


7R07


79


1063


2047


3031


787CIP2_81


*7JM0


80


1064


2048


3032


787CIP2_82


7R17


81


1065


2049


3033


787CIP2_83


7R16


82


1066


2050


3034


787CIP2_84


7R76


83


1067


2051


3035


787CIP2_85


7R47


84


1068


2052


3036


787CIP2_86





85


1069


2053


3037


787CIP2_87


/OOj


86


1070


2054


3038
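

In the rows of Table 5 that are clearly legible (for example the row that reads 46, 1030, 2014, 2998), the four SEQ ID NO columns appear to differ by fixed offsets of 984, 1968 and 2952. The source does not state this rule; it is an assumption read off the table, and the names `Table5Row`, `expected_row` and `is_consistent` below are hypothetical. Under that assumption, a minimal Python sketch for checking a doubtful cell against the rest of its row could look like this:

```python
from typing import NamedTuple, Sequence

# Offsets inferred from legible rows of Table 5. This is an assumption
# drawn from the data, not a rule stated in the source document.
PEPTIDE_OFFSET = 984
CONTIG_NUCLEOTIDE_OFFSET = 1968
CONTIG_PEPTIDE_OFFSET = 2952
# Assumed upper bound: one full-length peptide per full-length nucleotide.
MAX_NUCLEOTIDE_ID = 984


class Table5Row(NamedTuple):
    full_length_nucleotide: int
    full_length_peptide: int
    contig_nucleotide: int
    contig_peptide: int


def expected_row(nucleotide_id: int) -> Table5Row:
    """Return the four SEQ ID NOs expected on the row for `nucleotide_id`."""
    if not 1 <= nucleotide_id <= MAX_NUCLEOTIDE_ID:
        raise ValueError("nucleotide_id must be between 1 and 984")
    return Table5Row(
        nucleotide_id,
        nucleotide_id + PEPTIDE_OFFSET,
        nucleotide_id + CONTIG_NUCLEOTIDE_OFFSET,
        nucleotide_id + CONTIG_PEPTIDE_OFFSET,
    )


def is_consistent(cells: Sequence[int]) -> bool:
    """Check whether the first four cells of a row follow the assumed offsets."""
    return tuple(cells[:4]) == tuple(expected_row(cells[0]))


if __name__ == "__main__":
    print(expected_row(46))
    print(is_consistent([47, 1031, 2015, 2999]))
```

The check covers only the four SEQ ID NO columns. The priority docket numbers and the U.S.S.N. 09/496,914 numbers do not follow a fixed offset, so they are left out of the sketch.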
TABLE 6



SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


2953 


A 


3 


324 


ISEHRIEASGNYlAQRLTSSrXRGLSSWKSr^LML 
CGWTIIXTLTN1VQGEP*GP\KGIPG\FHTNSSYPH 
WGTVAKPPAGD*DLLPAPGQEGTFLFTR*SLCTY 
CPID 


2954 


A 


18 


467 


REELGKDLFDCTLYVLLKYDDFNADKHLALEEF 

YRAFQVIQI^LPEDQKLSITAATVGQSAVLSCAIQ 

GTIJRPPnWKRNNinJWLDLEDINDFGDIXjS 

KVTTTHVOfrJYTCYAIXj 

IRVYPESQARRAG 


2955 


A 


3 


23 


FYSAFLVADKGIVTSKHNNDTQHIWESDSNEFSV 
IADPRGNTLGRGTTIT*VSIPPSL 


2956 


A 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFWAVNAVHGW 

VLGKJMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWnCFCVWMAAEJLSIPQL 

VFYTVNDNARCIPIFPRYLGTSMKALIQMLEICIG 

FWPFLIMGVCYHTARTIJvlKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAV1L 

KI^QNKVLMLKNFIO^PIJDTRKNKV^ 

PGAVAHACNPSTLGGRGGRTTKSGDRDHPGQHG 










ETRSIJACWAQWKSlJU^VSRAPGRQGSLVVFP 
LP 




A 

A 


3 / J 


11/34 


U 1 IvUlvAJJCJJ 1 Cr NJsjNr U 1 KLKour Y LnlAjKL^LU 
NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 
KKGKTCGFKRGTETRVREEIQHPSAKGNLCPPTN 
ETRKCTV QRKKCQKGERGKKGRERKRKKPNKG 
ESKEAIPDSKSLESSKEEPEQRENKQQQ 




A 


1 
1 




T AT OTTO'l'L'tTPT C\7T n/PTnTVO/^TJOTlTUT O AtH/Tri 

LSMl^TISTEiiRLSVLWri^ 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVWLIFLIXLVlXZGFTLVTLVRnCGSQKMPL 
TRLYVTIIXTGLVFLFCSLPLSIQ*FLLYWIEKDLD 

TNT 


2960 


A 


1194 


852 


EKRKTSYSQCLNSKQRNVSMRPSIWIHVHLKPPC 
RLVELIJ>FSSAL(^1^HLSLGTTLPA^*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFV 


2961 


A 


274 


2250 


EKGKVKDAGAEQW1SLSLSCKGSWETQFSNHLN 

SLTPmVRRMPLITTVTLlJCMVAJRHHM^ 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHn 

SILMGQPMALVQLETLAPLTinQKFQTQDHMKF 

WKNLPLHSHHLTPSVPQTVIPKXTGSPEIKLKITK 

TIQNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 

EKRKKSNKHDSSRSEERKSHKIPKLEPEEQNRPN 

ERVDTVSEKPREEPVLKEGSPSSANTEFCSNNGSV 

HAV\FKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVrHXINTRGAREYHVQFFSNQPERAWVHEKRV 

REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 

PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 

DKQPEEALSQAKXSVASKTEVKKTRRPRSVLNT 

QPEQTNAGEVASSLSSTEERRHSQRRHTSAEEEEP 

PPVKLVWKTAAARi" rrPASTTte^OCG?? P* T'"N 

MaP v v .; >.'_?.•(< WALQ.TATGIXjKFIDQi v * o G 

NKTEISVRGQDRLnSTrWQRNEK^^ 

GSTGSVEKKC^RRSIRTRSESEKSTEVVPiwia^C 

KEQVETVPQATVKTGLQKGSADRGVQGSVRrSD 

SSVSAAIEETVD 


2962


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

AUQKLNSDPQFV1AQNVGTIHDLLDICLKRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWIFSCLN 

VMRLPFMKiai^EFEFSQ 

1JSAFVDTAQRKEPEIX3RLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRMND 

ILNHKMREFCIRLRNLVHSGATKGEISATQDVM 

MEEIFRWCICLGNPPETrTWEYRDKDKNNK^ 

PUTPLErl^R/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMVWRGEKLFYNNQPK)FLK 

KMVAASIKDGVEA V wr UCJJ V 

MNLYDHELWGVSLKNMNKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKW\RVGEFQWG 

VP\EEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKTTLKIAKNYLEQRAVGGASPRLAQS 
VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 
DRMKTTIKETST* LSNS YLVFPLM* SLTYLMKMS 










FERCTARNKMFVNSPFTKVDNYC^SXWKKFn^ 
KCYFSLNTBKKEKKMT 


2964 


A 


3 


2454 


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITIVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESff 

WKNAKEKEVPLEEEMUQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNIEPKDVP/IACASA'GFLP^ 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRIUiIJCEQNLS\VKVIFFQGAV , nVF 

NVNAPLPPRKEQEDCESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVF\TI>IPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAVNWLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HRGAIYGSSW 


2965 


A 


3 


2454 


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKH1JK1DLLSKIXNSGYFESIP' 

VP- r NAKPKT.vrLEEEN:; *Q^FKKTQLSKTE3VKE 

SESLME1 < ? QEFLNRR YMTEVDYSNKQGE 

EQPWEAD7 ::RKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGICHOEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISK£Ka>SPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTnCSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASIJPNIX}LLPR 

KLNTEPKDW/IACASA*GFLPLQPPFRRI/rIVLRK 

EKLQDLMTQIC^TCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGA\mVF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSfflVEQTVHSQETANYHPDGTIQVSNGS 

IJ^FYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGWSDSSQVSSPERDNETFNSGDSGQGDSRSMT 

PVDVPVTNPAATILPA^HVYPIJPQQMRVAFSAAR 

TSNLAPGTLIXJPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKIj^VNVPLYVNLMrO^ 

V^AYAMTV^APriRTTTAQKrHATT Cil FrWrTVYTCl/T PT 
V OjTX J /\1N1-Aj/Vr JL/xlCi 1 f\ DIN XTLr\_LL» J_lT y^JUK^X YfLiSxXu 

HRGAIYGSSW 


2966 


A 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 

IXjIAYVMANTGWGFSFLLLTVALLASYSVHLL 

LSMCIQTAY1XjP*TNYFMVLPAH*LTCLPLIEFLQ 










SL»NSL\*AVTSYEDLGLFAFGLPGKLWAGTOIQ 

NIGAMSSYLLIKTELPAAIAEFLTGDYSRYWYLD 

GQTLLIIICV GIVFPLALLPKIGFLGYTSSLSFFFM 

MFFALVVIIKKWSIPCPLTLNYVEKGFQISNVTDD 

CKPKLFHFSKESAYAIJ > TN1AFSFLCHTSILPIYCE 

LQSPSKKRMQNVTOTAIALSFLIYFISALFGYLTF , 

YD/GTTKAQRGEVTCHRIKDKVESELLKG* * *IP* ' 

SHDWVMT\VKLCILFAVLL\TVPLIHFPARKAVT 

MMFFSNFPFSWIRHR.ITlAlJvnil^ 

WGWGASTSTCLIFIFrHjLFYLKLSREDFLSWKK 

LGVGCFC/ll^FKTSllJlNSl^VYIILPASRKSIYFK 

I 


2967 


A 


3 



3222 


SGIVVRALWREKKPGGGRRVKKRNPGRQAVGH 

1EEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECXAYFGVSETITGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVIEQFED1XVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLILIANAIVGVWQERN 

AENAIEALKEYEPEMGKVYRADRKSVQRIKARD 

IWGDIVFVAVGDKVPADIRIIJVIKSTTLRVDQSIL 

TGEYVSVIKHTEPWDPRAVNQDKKNMLFSGTNI 

AAGKALGIVATTGVGTEIGKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISUCVAVWUNIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAV1T 

TCIALGTRRMAKKNATVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MRVFhTTOVRSI^KVERANACWSVIRQLMKKEFT 

I^SRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVIDRCmVRVGTTOVPLTGPVKEKIMAVIKE. 

WGl;^^-^^^^I^TRr;TPPKFPE^^VLDDSA^ ; 

LEYETDLTr VGWGMLDPPRKLVTG^ - XCRDA I 

GIRVIMITGDNKGTAIA1CRRIGIFGENEBVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SlirVEYLQSYDEITAMTGDGVNDAPAIJKXAEIGI 

AMGSGTAVAKTASEMVLADDNFSTTVAAVEEGR 

AIYNNMKQFIRYLISSNVGEVVCIFLTAALGLPEA 

IJPVQLLWVNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLIVSGWLFFRYMAIGGYVGAATVGAAA 

WWFLYAEIX5PHVNYSQLTHFMQCTEDNTHFEGI 

IX:EVFEAPEPMTMALSVLVTIEMCNALNSLSEN
QSIJLRMPPWVNIWLLGSICLSMSLHFLILYVDPLP
MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA
RNYLEG*LFPLLHL*ARVTDPEDERRK


2968 


A 


3 


2414 


GARSCSRIXjRCTFPLWKGREMEVRKLSISWQFLI 

VLVLILQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSEDEKYLLHFSHY 

VNEVAPDSFKKPYLKITSDWCFSOrnEPVWKEV 

lQELEELCr VuIUVYHAGYERRLAriHLOArid 1 PSI 

LGENGKISFFHNAWRENLRQFVESLLPGNLVEK 

VTNKbmam^GWQQENKPHVllJD 

YKLTAFAYKDYLSFGYVYVGIJRGTEEMTRRYNI 

MYAPTLLVFKEHINRPADVIQARGMKKQIIDDFI 










TRNKYLIAARLTSQKIJ1IEIXPVKRSHRQRKYC 

VVLLTAETTKLSKPFEAFI^FAIjtf^ 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERKNT 

AGRVVYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 

CWDSIFHNNW\REMMPLLSLFSALFILFGTVIVQ 

AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 

SKIPKKGFVEVTELTDVTYTSl^VRLRPGHMNV 

VLILSNSTKTSLIXJKFALEVYTFTGSSCLHFSFLSL 

DKHREWIJEYLl^FAQDAAPIPNQYDKHFMERDY 

TGYVLALNGHKXYFCIJFKPQKTVEEGGKP*GSC 

SDVDSSLYLGESRGKPSCGLGSRPDCGKLSKLSL 

WMERLLEGSLQRFYBPSWPELD 


2969 


A 


48 


1117 


KGI^PIXJVI^AFAPLDCENfWLKVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGrTTTP 

ASDIQIIWIJTERPHTMPKYLLGSVNKSVVPD^GI 

P/YISSP*CHPMASLLINI^FPDEGNYIVKVM<^ 

NGTLSASQKIQVTVDDPVTKPWQIHPPSGAVEY 

VGNMTLTCHVEGGTRLAYQWLKNGRPVHTSST 

YSFSPQNhTTLHIAPVTKEDIGNYSCXVRNPYSEM 

ESDnMPOYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPOTYSWIRRTDNTTYIIKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVHTSVGMCDIQGRDPNKT 


2970 


A 


68 



936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL 
QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 
FLTLLCIXLLIGLGV1ASMFHVTLKIEMKKMNKL 
QNISEELQRMSLQLMSNMNISNKJRNLSTTLQTI 
ATKLCRELYSKEQEHKCKPCPRRWIWHKDSCYF 
I^DDVQTWQESKMACAAONASIXKINNKNALE 
FU 'JSQSRSYDYWLOLSPEEW r*,7 3WYE n G* YN T ° V r ~ 
SAWVHC^APDu liilYCGr^^RLYVQYYHC: V'x ' 
QRMICEKMANPVQLGSTYFREA


2971 


A 


912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF

LVAFAYWNHYLSCT3PCSCYRPLCRLNFGLNVV 

ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 

GLP1JPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 

IVFTASSLGHMLLTCILWRLTKKHTVSQE^ 

AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 

RGVLGLGLGLGNKLRVVGQNLGL* HC VWV VWE 

TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 

HNSSAPPMYMGFFSPTVFGGGVGG*LHVTFILHP 

PEVEAAGIPLLLGPSLPQRQGREHIWILAAPACA 

PrTfl)R*WErTlEIRPSP*ELGlJlGEPTLSYPASCRVI 

RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 

MYCEAGVYTIFAILEYTVVLT^^^1AFH^^'AWWD 

FGNKELLITSQPEEKRF 


2972 


A 


1734 


246 


GGILSGRDGRTALPRPREPAERTAGLRRDMRPQE 

IJ>R1^^IXLUXLLLPPPPCPAHSATRFDPTWES 

LDARQLPAWFDQAKFG1FIHWGVFSVPSFGSEWF 

WWY WQKEKIPKYVEFMKDN YrPSrKYEDFGPL 

FTAKFFNANQ\WADIFQASGAKYIVLTSKHHEGF 

TLWG\SEYSWNWNAIDEGPKRDrVKELEVAIRNR 

TOLRFGLYYSDFEWFHPIJFLEDESSSFHKRQFPVS 

KTLPELYELVNNYQPEVLWSDGDGGAPDQYWN 










STGrlAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGH1XPHKWENCMTIDKLSWGY 

RREAGISDYLTIEELVKQLVETVSCXjGNLLMNIG 

PTLIXmSWFEERLRQMGSWIJmJGEAIYETHT 

WSQNDTVIPDVWYTSKPKEKLVYAIFIXWrTS 

GQLFLGHPKABLGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTTHQMPCKWGWALALTNVI 


2973 


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPWVPVQEffiroSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPILPRIQEQFQKNPDSYNGAVRENYTW 

SQDYTDLEVRVPVPKHWKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 


2974 


A 


271 


1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRAIJLVQHESSNQMFAMKEIRLPKSFSNTQ 

NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 

MEYCDGGDLMQKIKQQKGKLFPEDMILNWFTQ 

MCIXjVNHIHKKRVLHRDIKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYV GTPYYVPPEIWEN 

LPYNNK5DrWSLGCILYELCTLKHPFQANSWKNL 

ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATI1JLSRGTVARLVQKCLPPEIIMEYGEEVLE 

EIKNSKHNTPRKKTNPSRIRIALGNEASW 

DRKGSHTDLESINENLVESALRRVNREBKGNKSV 

HIi?XASSPW.j^PQWE>.>^T>TAT,TALEN/o^T 

SSLTAED?-UCC1-'. XYSKKITRK^ ; 

NILKNADl^Cv \PQTYTIYRPGS\EGFUCGPLSEK 

ASDSVDGGHLS\ r ^PERIJEPGLDEEim>FEEED 

DNPDWVSELKKRAGWQGLCDR 


2975 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNTVNTGELAAIKVIKLEPGEDFAWQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAEIXJPPMFDLHPMRAIJT^MTKSNF 

QPPKLKDKMKWSNSFHHFNTKMALTKNPKKRPT 

AEKLLQHPFVHTQHLTRSLAIEIJJ^^ 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQGXGYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKJP 

PPXJ>PKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\akpsqvpprpppprlpphkpyal-fungmssrqlng 

ERIXjSLCQQQNEHRGENI^RKEKKDVPKPISNG 

l^PTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNTNELHETSMEQIJPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 










RQMQK1J>VAIPAHKU>DRILPRKFSVSAKIPETK 

WCQK(XVVRhTPYTGHKYLCGALQTSIVLLEWV 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGKDPT^QVVRFETWPNSTSSWFTES 

DTrXJTh^HVTQLERDraVCLD^ 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTrUFRLLGSDRVVVIJES 

RPTDNPTANSNLY1LAGHENSY 


2976 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

AKNTW^EIjUUKVIKL^ 

D\CKHP\DIVAYF\GSYLVRRDKLWIVCMEF\CGSGS 

\LQDIYHVTGPLSELQ1AYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIEIAEIXJPPMFDIJffNIRALr^MTKSNF 

QPPKLKDKMKWSNSnffiDFVKMALTKOTKKRPT 

AEKLLQHPFVTQHLTRSI^EIJJDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEEUIQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPrTPKVHMGACFSKVr^GCPLKmCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCWRNPYTGHKYLCGALQTSIVLLEWV 

^PMQKFMLIP ^ HIPFPIPCPLKMFErvlL V : ' QEVP 

LVCVGVSRGRDFNQVVR* T , NFNSTL SWFiES 

DTPQTKVTHVTQLERDTIL V CLDCCIKIVNLQGR 

LKSSRIQ.SSELTFDFREESWC1XJDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRWVLES 

RPTDNPTAN SNL YILAGHENS Y 


2977 


A 


174 


1543 


YSLRKGITrTQAGAMVHIKKGELTQEEKELLEVI 

GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 

AAYKGKIJDMCKLLLRHGADVNCHQHEHGYTA 

LMFAAI^GNKDITWVMLEAGAETDVVNSVGRT 

AAQMAAr^GQHDCVTIIN>^^ 

GLDKEPKLPPKlJVGPUnCIITTTN^ 

NEOTLLTEEAAIJ4KCYRVMDLICEKCMKQRDM 

NEVlAMKMHYISCIFQKCINnJGDGENKLDTLIK 

SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 

QQLVRSIAPVEIGSDPTAFSVLTQATTGQVGFVDV 

EFC7TCGEKG ASKRCSVCKMVIY CDQTCQKTHW 

FMCKICKNLKDIYEKQQLEAAKEKRQEENHGK 

LD VNSNCYNEEQPEAEV GISQKDSNPEDSGEGK 

KESLESEAELEGLQDAPAGPQVSEE 


2978


A 




M77 


SDDlJlTGIJQDVQDAfiSlJU^PGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRITPWFbnTEDPDI 
STADLGDVLQDPCSLEYWDELQKV^AFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 










WSCFTPWFVPSLCVSFQFAHLEFHLCHHIJDQLG 

TAAPQYLQPFVSDRNMPSEL£YMIVSFREPHMYL 

RQWNNGSVCQEIQr^QAIXIKLLECRNVTMQS 

VVKP^SlFGQMAVSSDVVEKLLIXnVI\nDSV^ 

LGQHVVHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTOENILLASLHSHQYSWRS 

HKSPQLIJnCIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQinCGRQnCSYL 

SQSIELKWQHYIGQDGQAWREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVIQVPSSNSSIIYWCTVLTLEFNSQVQQ 

RMIVFSPLFIMRSHLPDPIIIHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQEJDEFYGPEKSL 

QPI^YNKKDSDRNEQl^QWDSPMRVKLSIWKP 

YVRTLLIEUJWA1XINESKWDLWLFEGEKIVLQ 

WAGKIHPPNFQEAFQIGrYWANTNTVHKSVAIK 

LVHNLTSPKWKIXjGNGEVVTLDEEAFVDTEIRL 

GAFPGHQKIXX5FCISSMV<K?GIQnQIEDKTniNN 

TWQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK' 

QIMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

WLGNFRENGFCTRAIVLTYQEHIXjVTY^ 

PSPRVIIHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDVV 

HQCGTVTITVAPEGKAGPILThnr^RAPEKIVTF/K 

MFITQLSLAWDDLTHHKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPVAALFELYCVE1CCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

KFARLY ~>TIKTLFDT YLPNSRLAGH3TH 1 

I^GGKQVLrMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKL YIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGiVLFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITN1ATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QIJPKQRHQPSD\VHAIX3APNSHVKYVWKMLQS 

LGRPEVHMALDVVLVRGSGQEHEGCLLLTSEA^L 

FWSVSEDTQ<NArT\rreiIX:AQDSKQNNIXTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLWLFPFS 

SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPHIG 

NYRIXKTIGKGNFAKVKIARHILTGKEVAVKIID 

KTQLNSSSLQKLFREVRIMKVLNHPNTVTKLFEVIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQIVSAVQYCHQKFIVHRDLKAENLLLDA 

DMNIKIADFGFSNEFITGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVILYTLVSGSLPFDGQ 

NLKELRERVUIGKYRIPFYMSTDCENLLKKFLIL 










NF^KRGTLEQIMKDRWMKVGHEVDDELKPYGEP 

LPVDYKDPRRTELMVSMGYTREEIQDSLVGQRYN 

EVMATYLLLGYKSSELEGDTTTLKPRPSADLTNS 

SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 

YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 

SPLPGLEMCKTTPTPSTNSVLSTSTNRSRNSPLLNE 

RASL\GQGFHPEWAKTALTMPGSRASTASASAA 

VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 

VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 

PDRTNFPRGVSSRSTFHAGQ1JIQVR\DQQNLPYG 

VTPASPSGHSQGRRGASGSBFSKFTSKFVRRNLNE 

PESKDR\VETLRPHW\NSGGNDKEKEEFREAKPR 

SLJRFIWSMKTTSSMEPNEMMREIRKVLDANSCQ 

SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 

RLSLNGVRFKRISGTSMAFKNIASKIANELKL 


2980 


A 


120 


3433 


NOLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGIJ'ETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMIARCPKSAEThTOQDINNLKEKWESVE 

TKLNER\KTVKLEEAL>JIA\MEFHNSL\QDFINWLT 

QAEQTIJ^rVASRPSLIIJDTVIJQIDEHKVFANEVN 

SHREQHELDKTGTHIJCYFSQKQDVVLIKNLLISV 

QSRWEKWQRLVERGRSLDDARKRAKQFHEAW 

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLDDQjHKEFMKKLEEKRAE 

^NKATTMGDT/I AICHPD3ITTIKHWIT1- ' rtfP 17 

VjuVw AKQHQQRLASaLAG. AY. ^ELLEaLLAW 

LQWAETTLTDKDKEVIPQEIbEVKAIJAEHQITM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSH1PV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANTOFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVREJISTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKF^ 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

WATTIPKIIJIPLTRNYGKPW^^ 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2981 


A 


120 


3433 


NCLLLQAKGFHGEEBDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAEThnDQDINNLKEKWESVE 

TKLNER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 

SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKWQRLVERGRSLDDARKRAKQFHEAW 










SKLMEWLEESEKSLDSELEIANDPDKDCTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTS3LADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEAX 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELBEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKJRAE 

LNKATTMGDTVIjyCHPDSITTIKHWm 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKKPRVNL 

LVSKWC^VWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRMDK 

IXJIXjKITRQEITIXjII^KFPTSRLEMS 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAK(XCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRELRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VP ATTTPKILJiPLTRNY GKP WLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982 


A 


1 


2065 


MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LY~ v PGCX:R^GSQF^r rrpir TPFLr ? T .r c7 P T X *KP! 

Di\ Ri YKG TQFrCtL . "in QI - AlETLsLL viitSAG 

QVQYLDIJHCKDTSKLF:>IE£1^ 

ESESLFLASHASGHLYL^TVSHPCASAPPQYSLL 

KQ\AWGFSrTAAKSKAPR>:TLAKWAVGEGPLNE 

FAFSPlXjRHI^CVSQIXX:LRWHro^ 

KSYFGGLLCVCWSPIXjRYVVTGGEDDLVTVWS 

FIEGRVVARGHGHKSWVNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

A 1 L 1 LQbKiUJKuAJiKiiHiUvYHS 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 

PIXEr^VCKmQERLTVLLrTJEIX^ 

TWARPGKAFTDEETEAQTGEGSWPRSPSKSWE 

GISSQPGNSPSGTW 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 
I/JELNANL^NLTSAFEKATAEKIKCQQEADATN 
RVILLAl^VGGl^ENmWAESVENFRSC^VTL 

pr;r>VT T A PV QWfiV W 1 ' kT VVT? "NTFT TOfPVPWTPVT 
yJKJxJ ¥ Lt LAjDJW vol VUIr lA-M xvlNITJLdVl^Jvr WlT II 

HNLKWIPITNGIJDPI^IXTDDADVATWNNQGLP 
SDRMSTENAmGNTERWPLIVDAQLQGIKWIKN 
KYRSELKAIRLGQKSYIJDVIEQATSEGDT1X1ENI 
GETVDPALDPLLGRNTIKKGKYIKIGDKEVGVPP 






QWPDPTHQVLQPTIX3ARDAGSVH\LINFL\rrRD 

GLEDQLLAAWAKERPDLEQLKANLTKSQNEFK 

IVLKELEDSLLARLSAASGNFLGDTALVENLETT 

KHTASEIEEKVVEAKITEVKINEARENYRPAAER 

ASLLYFILNDLNKINPVYQFSLKAFNVWEKAIQR 

TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 

KLIFLAQVTFQVLSMKKELNPVELDFLLRFPFKA 

GWSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 

EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 

LCMVRCIJIPDRMTYAIKNFVEEKMGSKFVEGRS 

VEFSKSYEESSPSTSIFF1LSPGVDPLKDVEALGKK 

LGFTIDNGKLHNVSLGQGQEWAENALDVAAEK 

GHWVILQNIHLVARWLGTLDKKLERYSTGRHED 

YRWIRAEPAPSPETHIIPQGILENAIKITNEPPTGM 

YANLYKALDUTQDTLEMCTKEMEFKCMLFAL 

CYFHAWAERRKFGAQGWNRSYPFNNGDLUSI 

NVLYNYLEANPKVPWDDLRYLFGEIMYGGHITD 

DWDRRIX:RTYI^EYIRTEML^GDVIJLAPGFQIPP 

NLDYKGYHEYIDENLPPESPYLYGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

KAVLDDILEKIPETTHMAEIMAKAAEKTPYVVV 

AFQECERMNILTNEMRRSLKELNLGLKGELTITT 

DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 

YANLlXRlRELEAw l IDFALPTl VWLAGFFNPQS 
FLTAIMQSMARKraWLDKMCI^V^ 
DMTAPPREGSYVYGLFMEGARWDTQTGVIAEA 
RLKELTPAMPVTFIKAIPVARMETKMYECPVYKT 
RIRGPTYVWTFNLKTKEKAAKWII^AVALIXQV 


2984 


A 


2 


1464 


FVLFPGIAMETPGASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEEli.il ^LERM^QIAIK*^ * * 3KM 

SREP"- VTPATI- ETPE VHAGTGV . - jiw . }>RG 

RGLEljGl^GEEEEKEPLPSLDVFLSRY SEDNAS 

FQElMEVAKERSRARHAWLYQAEEEF^r ? ^KDN 

LELPSAEHQAIESSQASVETWKYKAKNSLK'V ~? 

EGWDEEQUKKPRQVVH^^ 

CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 

GGFGFVATl^PAPGVNESPMMTWGEVEOTPLRV 

rAjocl FY VDR FPuPAF KILfyTCRREKLuLKMANE 

AAAKNRAKKQEAL1»VTENLASL1PKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 

DNLLQLPARRKASDFF 


2985 


A 


1890 


178 


ASTQEAG1XSPPGVGAQRCW1>IFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGOGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GnfionYmsinnnTn a a fiSRGYRO\or>R (tor cr g 

GSGGGGSVGGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NKITFVQGl^ENVT^ 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 










WFDGKEFSGNPKVSFATRRADFNRGGGNGRGG 

KUKOUJf MUlwvi I uuuuoUuUvjKviOr rauugu \j 

GG(^RAGDWKCPNPTCENMW^WRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIW(^LGENVraSVADYFKQIGlIKTNKKTG 

QPMINLYTORETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPKVSFATRRADFNRGGGNGRGG 

KuKuuPMGRGGYGGGG SGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 




A 

A 


137o 




GGAKAGGAPrlPirrLPFRHVGGLSAAPEEVEGML 
WAGARQHGRNWRKRETSPGTQGPLPPVPRA^PP 
GPlXiXPHAlAPTLSWAIPRQQCSPQPGRLNALPPD 
RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 
CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAIDP 
• \- TV/APLP\ YAAJFJ ^GNAx ^ T A ^/Y \QKV AR 
fOWGATWLLKI.A v AI ^ CCLSLPILAVPiARGGH 
WPYGAVGCRALFSriLLTMYASVIXLAALSADLC 
FLALGPAW\O.RFS/GA^GVQVACGAAWTLALL 
LTVPSA1YR1U/HQEHFPARLQCVVDYGGSSSTEN 

A"\/TT A TOET T7/TCT I^OT \T A If A 0/™*TJTO ATT /Ttr/ A A TiTlfy 

A V lAiKrLror LurLYA VAc^ItoALlLCWAARRC 

RPLGTAIWGFl^CWAPYHIXGLVLTVAAPNSA 

IXARAUlAEPLrVGlJUAHSCLNPM^ 

IJIRSIJAACTWAIJRESQGQDESVDSKKSTSHDL 

VSEMEV 


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQU^LDFUCYVDDIQKGOTlKRmQKlUaCPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

Nl^IARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHhOWTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSS1RH 

SPI^SGISTPVTWSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 
D1YVYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 
MLGVMTEADKEffiLQWTffiSLKEKlYRLEVQLR 
ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 










LVFSKWEAWQTRDQMVGSHMDLVDTCVGTS 

VETNSVGISCQPECKNKWGPEU'MNWWIVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNnDLTLLKTNLNLKBVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQ\OIQFmTETATLIESCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLL 

SGHSGFDRPSAVKTKESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHYIERIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELRNPDFQKTSLGKTTGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDIXJIAAGLYACnWESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNIEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

DVLRYVTNLADGNGNTALHYSVSHSNFEIVKLLL 

DADVCKVDHQNKAGYTPIM1AAIAAVEAEKDM 

RIVEELFGCGDVNAKASQAGQTALMLAVSHGRI 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLL AQPGCNGHLEDNEK3 STALSI ALEAGH 

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 

RGSFD 


2990 


A 


69 


1687 


ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 
AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 
RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 
ILEETRGPPASANPDKDHSTQPGTMGRKKIQISRI 

- i J AGEGGDPALPRPRLYPAAPAMPSPDWYGAL 
PPPG GuPSGLGEALPAQSRPSPFRPAAPKAGPPG 
LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 
GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 
PGGPPVGAEAWARRVPQPAAPPRRPPQSSIKSER 
LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPP\ 
CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 
\TSLQAFSEKIHTVTAPLRGGGLEVGGWTQSSAG 
GLLSFFLFVCISTNKNARGVRGPEKK 


2991 


A 


3 


1159 


IPQPLHGASPKEEMSLRCGDAARTLGPRVFGRYF 

CSPVRPI^SIJDKKKELLQNGPDLQDFVSGDLAD 

RSTWDEYKGNLKRQKGERLRLPPWLKTEIPMGK 

NYNKLKNTLRNLNXJTTVC 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYVVLTSVDRDDMP 

DGGAEHIAKTVSYLKERNPKILVECLTPDFRGDL 

KAIEKVALSGLDVYAHNVETVPELQSKVRDPRA 

XfmV^lQT "DVT VXTA VV\7/ , MJr>\/TCVTrCT\irT m rSTTMTYD 

INrL/v^oi-fK V JJKxiAAJv V l^rU V lolv 1 01M1AJL.UI1IN UJs 
QVYATMKALREADVIXXTIXjQYMQPTRRHLKV 
EEY1TPEKFKYWEKVGNELGFHYTASGPVLVRSS 
YKAGEFFLKNLVAKRKTKDL 


2992 


A 


3 


1636 


PVPGVPTSPPSCOPQDMQGPWVLLLLGLRLQLSL 










GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AK>JLILFLGIX3LGVPTVTAI1ULKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQT1GLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGVVTTTRVQHASPAGTYAHTV 

NRNWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWLAKHQGAWYVWNRTELMQASLDQS 

VTHLMGLFEPGDTKYEIHIU^PTLDPSUvIEMrEA 

ALRLLSRNPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

VFSFGGYXLRGSSIFGLAPSKAQDSKAYTSIL^ 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDIAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 


2993 


A 


3 


685 


DAWARIXKMNRI^GKAKPKAPPPSLTDCIGTVD 

SRAESIDKKISR1JDAELVKYKDQIKKMREGPAKN 

MVKQKALRVIJCQKRMyE(^RDNLA\NSHSTW\ 

TS\HYTIQSLKDTKTTWAMKLGVKEMKKAYKQ 

VKIDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710 


161 


RRCELTPFIDCTLILPKSWGAFPEDVVMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMVRGGPAGGQNMNVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

YIQVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTOSGGITHI^- V QS?Ar AGG^^m \SP- 

SHnT^^TQISSI; ^GQLVQQQQV:/: w?_...~;PL 

GFERTfGVLLPGAGGAAGFGMTSPPFFI:>PSRTA 

WPGLSSLPLTSVGNTGMKKVPKKLEEixPASPE 

MAQMRKQCLDYHHQEMQALKEVFKK YLLLi ■ ■ 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 

EEEEEEVHFEVIhn^EVKWARKHGQrcTPVAIATV 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 


2995 


A 


3 


924 


SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILDSTTRlXJDCYVDSPALTbnWMARTCAKQNINAP 

APATTSSWEVVRNPLIASSFSLVKLVLRRQLKNK 

CCPPPCKFGEGIO^KRLKHKDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHV AATxQALPQDSGTAA WKGXRV 

IXPETQKRQQl^EDTLTmGIJTEGYQALYHAVV 

EPMLWNPSGTPKRYSLELGKAKQKLWEALCSQ 

GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 

SKK 


2996 


A 


3 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

T PPT WOT! A POfKT? PrYFTf T-OsJlfT T QPTTPT 'MVTf TT XT 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 
KSFKHhHLDJJFflHNKSNA^ 
NSSYSHHENTrTTGVKPCERNQCGKVLSLKHSLS 
QNVKFPIGEKANTCTEFGKire^ 










VEKPHELSKCVNVFTQKPLLSIYLRVHRDEKLYIV 

CTKM/CGKGLHPRNSELIMHEKTHTREKPYKCNE 

\CGKSrTQVSSLLRHQTTrTrGEKLFECSECGKGFS 

LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 

MHQR1HTGERSYICTQCGQAHQKAHLIAHQRIH 

TGEKPYECSDCGKSFPSKSQLQMHKRIHTGEKPY 

ICTECGKAFT>niSNLNTHQKSHTGEKSYICAECG 

KAFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 
SELITHQRIHTTEKPYKCPIX:EKSFSKKPHIJ^VH^ 
RIHTGEKPYICAECGKAFIDRSNFNKHQTIHTGD 
KPYKCSDCGKGFTQKSVLSMHRNIHT 


2997 


A 


3 


1763 


AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 

FQMSCGIHYLASVFMGVTPHHVCRPPGNVSQVV 

FHNEfSNWSLEDTGALLSSGQKDYVTVQLQNGEI 

WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 

YIYDQNTWKSTAVTQWNLVCDRKWLAMLIQPL 

FMFGGPTGIGATTFGYFVSDRLGRRWLWATSSS 

MFLFGIAAAF A VD YYTFMAARFFLAMV ASGYLV 

VGFVYVMEFIGMKSRTWASVHLHSFFAVGTLLV 

ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE 

TPFWLLSEGRYEEAQK\TVDIMAKWNRASSCKLS 

ELLSLDLQGPV SN SPTEVQKHNLS YLFYNWSITK 

RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 

FLLGVVEIPAYTFVCIAMDKVGRRTVLAYSLFCVS 

ALACGVVMVIPQKHYILGVVTAMNVGKILPIGAA 

FG\LIYLYTAELYPTIVRSLAVGSGSMVCRLASIL 

APFSVDLSSIWIF1PQLFVGTMALLSGVLTLKLPE 

TLGKRI^TTWEEAAKI^ENESKSSKLLLTTNNS 

GLEKTEAJTPRDSGLGE 


2998 



A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DWGFD?..r-H TT CRYNI F?~ \PT TYNSFAQKLVKF 

KGYDi: 1^: H , . ED WDr CCKGLALDLEDGI4FL 

KIANNGT^XRASHGTKMNITPEVLAEAYGKKEW 

KHrl^DTXiMACRSGKYYFYDNYFDLPGALLCAR 

VVDYLTKLNNOQKTTOFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLRQ 

IJKNAGKILLLITSSHSDYCR1XCA\YILGNDFTDLF 

DIVTTOALKPGFr^HLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKW 

YFGDSMHSDIFPARHYSNWETVLILEELRGDEGT 

RSQRPEESEPI^KKGKYEGPKAKPLNTSSKKWGS 

IWDSVLGLENTEDSLVYTWSCKRISTYSTL^ 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLVLSS 

DETLISK 


2999 


A 


320 


2417 


LRRRKMIPQSLLQTTLrll^IJLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSUiYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

TOAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

rbAAPASQQLQSLESKLTSVRFMGDMuSrEEDKJ 

NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 

VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 

DKNSSQVLGEKVLGIWQNTKVANLTEPVVLTF 

QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 










ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 

KHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 

CRRKPRDYTIKVHMNLLIAVFLLDTSFLLSEPVA 

LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 

RLWEWGTYVPGYLLKLSAMGWGFPIFLVTLV 

ALVDVDNYGPniAVHRTPEGVIYPSMCWIRDSL 

VSYITm.GIJ^SLWLFNMAMLATMVVQILRLRPH 

TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 

LVVLYlJF^lITSFQGFLinWYWSMRLQARGGPSP 

LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGVVFSVTTVD 

LKRKPADUJMAPGTHPPFITFNSEVKTDVNK^ 

FLEEVLCPPKYLKI^PKHPESNTAGMDIFAKFSA 

YIKNSRPEANEALERGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIKFSTRKFLIX3NEMT^ 

fflVKWAKKYRNFDIPKEMTGIWRYLTNAYSRD 

EFTNTCPSDKEVEIVAYSDVAKRLHQVKSRLLKE 

VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WIAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYT1SGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002 


A 


909 


2799 


VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVJJtTOGFHAPDSr^ : 

KRTGL > O/^GKGF; JHSMEVIHGKNi cuIIL ~iCY 

PESVKSf'T^HFreiXJHQKIMKRGKKSYBG INFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCrl'F NSS 

LVLHHRTHTGEKPYTC^CGKSFSKNYNLIViiQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFhnWSSLOJHQRTHTGBKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RIXjSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKECGKTFIESAYLIRHQRIH 

TGEKPYGC^QCQKLFRNIAGLIRHQRTHTGEKPY 

ECWQCGKAFRDSSCLTKHQRIHTKFIPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 
MKGSCGIGGGIGGGSSRISSVLAGGSCRAPSTYG 
GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 
FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 
VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

13\r^TO^T17Vr/%T>/^T*T*OTTT1/'T%\/'Cty\/tV'TTI3T\T t>XTVTTA 

tiViSJKU VV I yKQKJPSElKD Y or Yr K 1 JUbULKiNKUA 

ATDSNAQPELQIDNARLAADDFRTKYEHELALRQ 
TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 
LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 
VDI^CILNEMRNQYEQMAEKNRRDAETWFLSKT 










EELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 
QSQLSMKASLENSLEETKGRYCMQLSQIQGLIGS 
VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 
TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 
SSRQTRPILKEQSSSSFSQGQSS 


3004 


A 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK 

DKVLVAARRNASAWLYNEERYGNITLPMSHAG 

TGNIVVIMISYPKGREI1JELVQKGIPVTMTIGVGT 

RHVQEFISGQSVVFVAIAFITMMnSLAWLIFYYIQ 

RFLYTGSQIGSQSHKKETKKVIGQUXHTVK^ 

KGIDVDAENCAVCIENFKVKDinUIJPCKHIFHRIC 

IDPW1XDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 

GPIS 


3005 


A 


184 


2552 


TMTMQF1XLFLFWVCIJPHFCSPEIMFRRTPVPQQ 

RIl^SRVPRSDGKILHRQKRGWMWNQFFLLEEY 

TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 

IJF1IDEKTGDIHATRRIDREEKAFYTLRAQAINRR 

TLRPVEPESEFVIKIHDINDNEPTFPEErVTASVPE 

MSWGTSWQVTATDADDPSYGNSARVIYSILQ 

GQPYFSVEPETGHRTALPNMNRENREQYQVVIQ 

AKDMGGQMGGI^GTTTVNITLTDVNDNPPRFPQ 

NTfflDLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YRIDXjDGTDMFDIVTEKDTQEGIITVKKPLDYES 

RRLYTLXVEAEhnHVDPRFYYLGPFKDTTIVKISI 

EDVDEPPVFSRSSYlJEVHEDffiVGTnGTVMARD 

PDSISSPIRFSLDRHTDLDRIFNIHSGNGSLYTSKP 

LDRELSQ WHNLTVIAAEINNPKETTRVAV FVRIL 

DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 

TRKNGFN,^ ui: 1 77XPVViSDls D YPIQSSTGTLn 

RVCACDSQG' ". MQSCSAEALLLPAGLSTG ALIAIL 

LCHILLVIWLi 7 ^^UCRQRKKEPLILSKEDIRDNIV 

SYNDEGGGEEMQAFDIGTLRNPAAIEEKKLRRD 

IIPBTIJTPRRTPTAPDNTDVRD 

TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 

QNYDYLREWGPRFNKLPQKYGGGESDKDS 


3006 


A 


2 


541 


GRVDKTWWGKSVGIMLTELEKALNSIIDVYHKY 

SLIKGWHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKELDINTDGAVOTQEFLILVIKMGVAALNSII 

DVYHKYSLKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLILVIKMG 

VGSPQKKVASYF 


3007 


A 


1 


1253 


MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGEEAINVPEPIPDSYYRDMATWPTrlAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKIAPTHWPPEKRVAYCFEVAAQRSPDKKT 

CPMKEGNPFGPFWDQFHVSFNKSELFTGISFSAS 

YREQWSQRrePKEHPVLAIJ'GAPAQFPVL^H^ 

UJKYMVWSDENfVKTGEAQIHAHLVRPYVGIHL 

RIGSDWKNACAMLKDGTAGSHFMASPQCVGYS 

RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAQ 

SVYVATDSESYVPELQQLFKGKVKWSLKPEVA 










QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPTMAANYSSTSTRREHVKVKTSS 

QPGFLERLSETSGGMFVGIMAFLLSFYLIFTNEG 

RALKTATSLAEGLSLWSPDSIHSVAPENEGRLV 

HEGALRTSKLLSDPNYGVHLPAVKLRRHVEMY 

QWVBTEESREYTEDGQVKKETRYSYNTEWRSEII 

NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLroKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HVVTYIARQRGIXJLWFSTK5GDTLIXLHHGDFS 

AEEVFHRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTLVDWFPWRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWU^YRPLWALUAGLALVPILVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWAIELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DRILYEKRCWDIALGPLKQIPMNLF1MYMAGNTI 

SIITTMMVCMMAWRPIQALMAISATFKMLESSS 

QKFLQGLVYUGNIJVIGLAI^VYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 



1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VWECTMASSNTVLMRLVASAYSIAQKAGMIVR 

RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

QPCPSQYSAIKEEDLWWVDPLDGTKEYTEGLL 

DNVTVLIGIAYEGKAIAGVINQPYYNYEAGPDAV 

LGRHWGVLGLGAFGFQLKEVPAGKHliri'lKSH 

SNKLVTDCVAAMNPDAVLRVGGAGNKEQLIEG 

KASAYWASPGCKKWDTCAPEVILHAVGGKLTD 

IHGNVLQYHKDVKHMNSAGVI /:~JRNYD*'YAS 

RVPESIKtf ALVP . ; 


3011 


A 


29i'V 


1452 


SPQKTMRSHTrTMTlTSVSSWPYSSHRMRFlTNH 
SIXJPPQWSATPNVTTCPMDEKIXSTVLTTSYSV^ 
FIVGLVGl^IIALYVFLGIHRKKNSIQIYLLNVAIAD 
LLIJFCl^FRIMYHINQNK 

MNMYTSIILLGFISLDRYIKINRSIQQRKAnTKQSI 

YVCCIVWMLALGGFLTMnLTLKKGGHNSTMCF 

HYRDKHNAKGEAIFNhlLVVMl' WLIFIXIILSYIKI 

GKmLRISKRRSKFPNSGKYATTARNSFIVUIFTI 

CFVPYHAITaFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSFNSOJ)PVMYFI24S^^ 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDNIQVQEhDFNISRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCKKl'SGAYEKTDTDGKFLYHKSKWNITMESY 

VVHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

QLRETLLXJDFRWAQGVGIPEDSIFIMADRGECV 

PGEQEPEPILPRVRRAVLPQEEEGSGGGQLVTEV 

X Akft-Bl/ol^l^lAJ I or\\Jx V'lYlVJIVl 1 OxV. If IINVJ I olYlAv^ 

ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMAIXKANKDLISAGLKEFSVLLNQQVFNDPL 










VSEEDMVTVVEDWNINFYINYYRQQVTGEPQER 
DKAIXJELRQELNTLAWFLAKYIUJFIJCSHELPSH 

DDDGC 


3014 


A 


1 


373 


GTSWSTLRAVMSASWSVVSRVLEEYLSSTPQRL 
KLI^AYLLYILLTGALQFGYCLFVLTTHFNSLLLF 
FFFCVGSFHSNVYFLLFIX^FLCFLFIAYFFLIIOTS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRIJ^LTKTVKDAVQKNSEKYLS 

ELAEQPEIUCITRNQBCRKHDEINHVQKTYAEMDP 

TTAALEKEHEAJTKVKYVDKIHIGNYEIDAWYFS 

PrTEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHTVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLJAFSYEI^KLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEEJIDFRGTLSIKDLSQMTSITQNDnST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRIJVLTKTVKDAVQKNSEKYLS 

EIAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPraWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHTVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFL * AFSYFT ^ T ^STVG£^cLPl ^PLGKLSY$5 Y 

ws w vlli ^lfj:: ^r^xsnu)LSQMTsrrQMinsT 

LQSLNMVKV YKGQHVI(^/TPKLVEEHLKSAQY 
KKPP1TGGWCAAVCRGRWGSVSIWTGRSQGLLI 
AVT 


3017 


A 


38 


704 


EAHPGGQLG SERNG VRMDED VLTTLKILCGESG 

VGKSSLLIJlrTT)DTFDPELAATIGVDFKVKTISVD 

GNKAM^WDTAGQERFRTLTPSYYRGAQGVl^ 

VYDVT1UU)TFVKU)NWLNELE^ 

LVGNKTOKENTOV^^ 

AK 1 CLHj VQCAFEEL VJEKIIQTrXjL WESENQNKG 
VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLILQMVKLSlVLTPQFLSmKJGQLTKELQQH 
VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 
HTSHSG 


3019 


A 


1307 


711 


PGITMAASLVGKKJVFVTO^ 

FPCIlVAQKIDLPEYQGEPDEISIQKC 

QGPVLVEDTCLCFNALGGLPGPYIXWl^EKLKPE 

GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 

RGRTSGRIVAPRGCQDFGWDPCFQPDGYEQTYA 

EMPKAEKNAVSHRFR AT T FT OFYFfiSLAA 


3020 


A 


1202 


180 


VSCLI^CKMTILIWQDQPVPWSSHPDEYKIAA 
LVl^SCIFnGLFVNITALWWSCTTC 
1V11WALVDLIF1MTLPFRMFYYAKDEWPFGEYFC 
QILGALTVFYPSIALWLIJ^ADRYMAIVQPKY 




AKEIJKOTCKAVLACVGVWIMTLTTTTPLLLLYK 

DPDKDSTPATCLKISDUYLKAVNVLNLTRLTFFF 

IJPIiTMIGCyLVIIHNlXHGRTSKLKPKVKEKS 

IITIXVQVLVCFMPFfflCFAFLMLGTGENSYNPW 

GAFTTFLMNl^TCLDVILYYTVSKQFQARVISVM 

LYRNYLRSMRRKSFRSGSLRSLSN1NSEML 


3021 


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVTTTAGASPGPPRNKKNREL 
RPQRPKNAYILKKS1USKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKK^CEIb^MEAVRDIRFLHSEALLAVAQN 
i iRmiHYDNQGIELHCIRRCDRVTRl^FIJFHFUJ^ 
TASETGFLTYLDVSVGlOVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAEL1CLDPRALAEV 
DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 


3022 


A 


1 


2249 


MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

V?TSKG\TPTOKTHLSFTr^ f EH 

QTTSPVQKSYLGST;?... ?^FCFSA1^LHQHQKHYN 

EEEP WKRKVDEATFV TGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK 

CTKAFTCKNTLVQHQQIHTGQKMFECSECEBSFS 

KKCHLILHKIIHTGERPYECSDREKAFIHKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGBRPYECRECGKSFRQFSN

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

URHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRI 

HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

IIAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKE 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIPWDDKDFRMFFLWTALFWG 

GVMFYIJ.LKRSGRE1TWKDFVNNYLSKGVVDRL 

EVVNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT 

FERNLETLQQELGIEGENRVPWYIAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVIJCDEIDVKFKDVAGCEEAKLEIMEFV 

NFUQ^KQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QEbnXNQIiVEMDGFNTTTNVVILAGTNRPD^ 

PAUJ^GRFDRQMGPPDIKGRASIFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANV CNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKKTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSII 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKRTVALLTEKKADVEKVALLL 

LEKEVXJDKITOMVEIXGPRPFAEKSTYEEFVEGT 

GSIJDEDTSLPEGIJCDWNKEREKEKEEPPGEKVA 

N 


3024 


A 


274 


1455 


LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAHSRVQCRIVALDLRSHGETKVKNPED 

LSAETMAKDVGNWEAMYGDLPPPIMLIGHSMG 

GAIAVHTASSNLVPSLLGLCMTOVVEGTAMDAL 

NSMQNFLRGRPKTFKSLENAIEWSVKSGQIRNLE 

SARVSMVGQVKQCEGITSPEGSKSIVEGIIEEEEE 

DEEGSESISKRKKEDDMETKKDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 

AEAVATPLIRHRFAEPIGGFQCVFPGC 


3025


A 


621 


306 


YHCH^.^RAGGSFRS^ 7 ?^^^Li^ ^FTT^ST, 
SWKGi-SSLLFPLYNLQ-sA jmsIZ:^ ?JCELGRGK^PP 
HLEGPHMLPSGAARWRWJ EAPVLVLEPLVLRPA 
AAPTP 


3026 


A 


1533 


454 


AKVPQSTREEKRENGLEARSPAT.SLMGFNVEEM 

YEAHAWIQRILSLQNHHIIENNHILYLGRKEHDIL 

SQLQKTSSVSITEHSPGRTELEIEGARADLffiWM 

NIEDMLCKVQEEMARKKERGLWRSLGQWTIQQ 

QKTQDEMKENIIFLKCPVPPTQELLDQKKQFEKC 

GLQVIXVEKTONEVLMAAFQRKKKMMEEK^ 

QPVSHRLFQQVPYQFCNWCRVGFQRMYSTPCD 

PKYGAGIYFIIOTJCNLAEKAKKISAADKLIYVFE 

AEVLTGFFCQGHPLNIVPPPLSPGAIDGHDSVVD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 


3027 


A 


179 


703 


PFHIXjASSNTFRLQVQTQESKAQKEVKMGFIFSK 
SMNESMKNQKEFMLMNARLQLERQLIMQSEMR 
ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 
KKKKPAFLWIWl^FILTYQYDLGYGTLLERMK 
GEAEDILETEKSKLQ1PRGMITFESIEKARKEQSR 
FFTDK 


3028 


A 


876 


1226 


AVGKEPESSSTWVRDREGHIRSRRSMKMLWKLT 
DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 


3029 


A 


3 


1731 


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDl^GECSRKLDQKLPELRGVGDPAMISSNTSYL 

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGHNLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSAIJVTVLCEAGEAFLIPTPYYG 

AITQHVCLYGNIRI^YVYIJDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLIL1SPQNPLGDVY 

SPEELQEYLWAKRHIU.HVIWEVYMLSVFBfCSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AQLLRDRDWINQVYl^ENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDIJIKYLLKGTFEEEML 

LWRRFLDNKVLLSFGKAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEIFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 

CWSSCGQHPVQATHRGAVSNSLMLCILKLASQM 

PLENTI^QQMVFMLLSNLALSHDCKGVIQKSNF 

LQNFLSLALPKGGNKHLSNLTILWLKLLLNISSGE 

DGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLI 

FHNVCTSPANKPKIIAb^KYITVLAACLESENQN 

AQRIGA A AT^WALIYK Y QKA KTALKS p SVKRRY- ~ : 

EAYSIAKKlFPNSEAOTl^AYYLl: "vLiMLVQU- 

NSS 


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDEP 

LSRPKKKKPRITCNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPEAPVQRRQKKTOLPLELETSST 

QKKSSSSSLLKNENGIDAEPAEEAVIQKPRRKTK 

K1QPAELQYANELGVEDEDIITDEQTTVEQQSVF 

TAPTGISQPVGKVFVEKSRRFQAADRSELIKTTEN 

roVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 

FIAGCAVWNIVVIYVLAGIXJLSNLSNLLQQYKT 

LAYPFQSLLYLIJJ^LSTISAFDRIDFAKISVAIRNF 

I^DPTAIASFLYFTALII^SQQMTSDRIHLYTP 

SSVNGSLWEAGIEEQILQPWIVVNLWALLVGLS 

WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA 

SS 


3033 


A 


3 


1436 


TATSGGIWUOOCWCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSFTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGAR>JIRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSG^ 

HhnTTRPNEKGEYEVAEGIGSTVFRAILDYYKTGII 

RCPDGISIPEUIEACDYLC1SFEYSTIKCRDLSALM 

HEl^NDGARRQFEFYLEEMILPLMVASAQSGERE 

CHWVLTDDDVVDWDEEYPPQMGEEYSQIIYSTK 

LYRFFKYIENRDVAKSVLKERGLKKIRLGIEGYP 

TYKEKVKKRPGGRPEVIYNyVQCTFIRMSWEKE 

EGKSRHVDFQCVKSKSrTNLAAAAADIPQDQLV 

VMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 


3034 


A 


3 


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFS SIJU/HH>n^THLMRS AKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAILGKG 

GYGRVYKVKNKLIX3QYYAIKKILIKGATKTVCM 

KVLREVKVlJVGLQHPNWGYOT 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQNNKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDWIVERNKRGREYVDESACPY 

VMANVATKJFQELVEGVFYIH^GIVHRDLKPR 

NIFIJJGPIX^VKIGDFGLACTDILQKKroWTNR 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

MYSLGVVLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTRRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 

SQDKGVRDDGKDGGVG 


3035 


A 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 
PPPSGLKQSSHLSLSSSWDFRrlAPTHPETYTCPK 
MIEMEQAEAQLAELDLLASMFPGENELIVNDQL 
AVAELKDCmKXTMEGRSSKVYFTINMNLDVSD 
EKMAMFSLACILPFKYPAVLPErrVRSVLLSRSOQ 
TQLNTDLTAFL ' : ' 5CHGDVCILN*T^^REH/: 
GY * ;;:IU>TSSSF*; TGST VQS VDLb ^v>- ^UHxY 
NKCjOIKNIIJsWAKBL^^ 

PQSACEEFWARLRKLNWKJULIRHRjiDiPFDGTN 
DETERQRKFSIFEEKVFSVNGARGNHMDyGOLY 
QFLNTKGCGDVFQMFLWV 


3036 


A 


1 


2288 


FRFAERRAAAAESDVSAKMAGRSMQAARCPTD

ELSLTNCAWNEKDFQSGQHVTVRTSPNHRYTFT

LKTHPSWPGSIAFSLPQRKWAGLSIGQEIEVSLY

TFDKAKQQGTMTIEIDFLQKKSIDSNPYDTDKM

AAEHC^FNNQAFSVGQQLVFSFNEKLFGLLVKD

IEAMDPSILNGEPATGKRQKIEVGLWGNSQVAF

EKAENSSLNLIGKAKTKENRQSIINPDWNFEKMG

IGGLDKEF^DIFRRAFASRVFPPEIVEQMGCKITVRC

GILLYGPPCK:GKTLLARQIGKMLNAREPKVVNG

PEILNKYVGESEANIRKLFADAEEEQRRLGANSG

LHIIIFDEIDAICKQRGSMAGSTGVHDTVVNQLLS

KIIXjVEQLNMLVIGMTbn^DLIDEALLRPGRLEV 

KMEIGLPDEKGRLQIUnHTARMRGHQLLSADV 

DIKEl^VETKNFSGAELEGLVRAAQSTAMNRHI 

KASTKVEVDMEKAESLQVTRGDFLASLENDIKP 

AFGTNQEDYASYIMNGIIKWGDPVTRVLDDGEL 

LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 

EESNFPFIKICSPDKMIGFSETAKCQAMKKIFDDA 

YK^QI^CWVDDIERLLDYVPIGPRFSNLVLQAL 

LV1JJCKAPPQGRK1XUGTTSRKDVLQEMEMLNA 
FSTTIHVPNIATGEQIXEALEIXGOT^ 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037 


A 


1 


1347 


MLDTGSEHLNRILKALPALQSAGSEGQNGSAESL 

GEGGTRDSDRAIUIKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIRErVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYWGRERTELARQLNLSE 

TQVKVWPQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

RLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

U>NHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 

LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLEL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKK1FQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYFNLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFQ 

AlPWKl^CKNLCH/JiQL, 1:DLGW:IKWI1AP 
KGFMAl^CHGECPFSLTISLNSSNyAFMQALMH 
AVDPEIPQAVCIFTXLSPISMLYQDNNDNVILRHY 
EDMWDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRLGSTEEAG GRSL WFPSDL AELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLWSYFPDKVALLQRKVEENRNSLFF 

FLLFLRIJ^MTPNWIl.NLSAPILNIPIVQFFFSVLr 

GLIPYNnCVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHIHSR 

KDT 


3041 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVRTKVPCSV1MSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

C1JCNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQAROTLRAMKLGEEAFFY 

HSNCKEPGIAGl^lKIVKEAYPDHTQFEKNNPHY 

DPSSKED>TPKWSMVDVQr^RMMKRFIPlj\ELK5 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATECN 

CLKNI^SHWLMKSEPESRl^KGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQA1WFLRAMKLGEEAFFY 

HSNCKEPGIAGLMK1YKEAYPDHTXJFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASVIVGHPLDTVKTRLQAGVGYGhrrLSCIRVVy 

RRESMFGFFKGMSFPLASIAVYNSWFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GUjGPVDLIKIRLQMQ^PFRDANLGLKSRAVAP 

AEQPAY(^PVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEWITPEACTGPSPCAV 

WIJVGGMAGAISWGTATPMDVVKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGEHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVTEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLDQEPRRNKKRGIFPKVATNIMRAWLFQHL 

SHPYPSEEQKKQIAQDTGLTILQVNNWFINARRR 

IVQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH 

VAFRAPASVXjMSLNSEGEWHYL 


3045


A 


3 


967 


VAHTQWK? V -\£T £ QI^THRriLirT? I HTHA 2wV 

PSPRWGQTFEGIJPAASPCGPG^ 

WGMI^CLCTVLWHLPAVPALNRTG^PGPGPSIQ 

KTYDLTRYLEHQI^IAGTYLNYLGPPFl^DFN 

PPRLGAETLPRATVDJ^VWRSOsTOKLRLTQNYE 

AYSHL1XYLRGLNRQAATAELRRSLAHFCTSLQ 

GIXGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWIJJCELQTWLWRSAKDFNRLKK 

KMQPPAAAVTLHLGAHGF 


3046 


A 


1185 


1584 


MYAYMYlCTHICICAYRGIHIDVYLYMCIYIrnWI 
HTYLCVHTYVYVYICTfflCMCIHTYVY^ 
VYTYIOXAT^CIX:VHIYLC^^ 
IHTYVHMCICVYIHMYTCVYVYTYTCVYNfY 


3047 


A 


811 


132 


SLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 

NTDAHLDINFKEGUCKERSYTGQFEANVRDEER 

QCGCGWPDSLLMKVLSQRLDQQDCIQKGWVL 

HGWRDIJ3QAHLLNRLGYNPNREFFLNVPFDSI 

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 

QNPKDAEEQVKLKMDLFYRNSADLEQLYGSAIT 

LNGDQDPYTVFEYIESGIINPLPKKIP 


3048 


A 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEITCRLHQ 
YIXjSIVVIQNPARQTLFFNGTRALKDERFQLEEFS 
PRRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
LTVLVAPENPWEVREQAVEGGEVELSCLVPRSR 

PAATLRWYRDRKELKGVSSSQENGKVWSVAST 

VRFRVDRKDDGGmCEAQNQALPSGHSKQTQYV 

U)VQYSPTAIUHASQAVVREGDTLVLTCAVTGN 

PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLVVYGESRLRPT 

EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 

nCVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRKGRASEHKDQLSRLKDRDPEFYKPLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSWVTVAMVERWKQ 

AAKQRLTPKLFHEWQAFRAAVATTRGDQESAE 

ANKFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDEKAYLGSAIQL 

VSCI^ETTVlJ^AVLRfflSVLVPCFLTFPKQCRML 

UO^MVWWSTGEESLRVIJ^VI^VCRHKKDT 

FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 

LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQr^VYPLAQVIIGCnOJPTARFTPUlMHCIRALT 

LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK 

PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPWLQLKSFLRECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDREIQL 

EISGKERLEDLNFPEIKRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 


3050 






182 


^LDk? 1KSPGSGSS I^PSHl^YLLHF^SlR^^ 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSiTTSCGSCGGSKGGCGSCGGSKGGCGS 

CGC^QSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGVVLVHFSSEEVDMASDSPARS 

LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQIAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRiNIATYYGAFIKKNPIH3MDDQLWLVMEFCG 

AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRDIKGQNVIXTENAEVKLVDFGVSAQ 

LDRTVGRRmTIGTPYWMAPEVIACDENPDATY 

DFKSDLWSLGITAIEMAEGAPPLCDMHPMRALF 

LIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRIQLKDHIDRTXKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

EEHKRQLLAERQKEUEEQKEQRRRLEEQQRREKE 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS 

SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP 

TMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TRIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDrraPS 

RPASYKKAJDEDLTALAKELREXJOEETNRPMKK 

VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 

PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS 

GSISREGTIMERETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEAR1GSVVNVNPTOI 

RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKKNKLRVYYLSWLRNRIIJINDPEV 

EKKQGWITVGDIJEGCIHYKVVKYERIKFLVIALK 

NAVEIYAWAPKPYHKFMAFKSFADLQHKPLLVD 

LTVEEGQRLKVIFGSHTGFHVIDVDSGNSYDIYIP 

SHI(^MTPHAIVILPKTDGMEMLVCYEDEGVYV 

NTYGRITKDVVLQWGEMPTSVAYIHSNQIMGW 

GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERN 

DKVFFASVRSGGSSQVFFMTLNRNSMMNW 


3052 


A 


1 


615 


MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE 

KLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 

GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 

PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 

RSQIAHALKLSEVQVKIWFQNRRAKWKRIKAGN 

VSSRSGEPVRNPKIXTVTIPVTiVNRFAVRSQHQQM 

?OGARP 


3053 


A 


203




FGVRVPSNT<V,Ja,","I .■^HCMQrSE tvDSECLTSLQP 

LPLPTPPAANEA ' ILQTAAISL WTWAA VQAIERK 

VEIHSRRLLHLEC-RTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQ?/iGlXQRRI£NLENLLRNR 

NFWILRLPPGDCGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEIPTBPSEEPGISTS 

DILSWKQEEEPQVGAPPESKESDVYKSTYADEE 

LVDCAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KISLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH 

QMIHTGERPYPC11X:SKSFMRKEHL 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 

VJOVTVJVJ V L/ 


3054 


A 


3 


2212 


SCGHKSAYGSYTGLQUFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFVVGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF 

APWARASFLCHAPQRPLTGIGLNTVRFTSEFPLH

SKDPTAHKIXFTGNYU:KIJIPRPIUIA?QGSLSDF

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ

KTCPCDICGLRLKDILHLJ^EHQTTRLPRQKPFVCE

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE

VTPSDGEPHEATEGVVDFHIALRHNKCCESGDAF

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK

PYECSDCGKITSQRSNLIHHKRVYTGRSAHECSE

CGKSFNCNSS1JKHWRVHTGERPYKCNECGKFFS

HLASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK

HQRVHTGERPYE(^CGKLFSQSSSLNSHRRLrrr 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKIHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVKYXLKLVHPSTHPGEVP 


3055 


A 


268 


2954 


ARRSSSSQGSAAPTPCQVVEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AWGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRWGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQFITRENCL1LA 

VTPANTDLANSDALKLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRGYVGVVNRSQ 

KDIDGKKDIKAAMIAERKFFI^HPAYRHIADRM 

GTPHLQKVLN(^LTNHIRDTLPNFRNKLQGQLLS 

ffiHEVEAYKWKPEDPTRKTKALLQMVQQFAVD 

FEKJUEGSGIXJVDTLELSC^jAKINRIFHERFPFEIV 

KM T !J^ T EKELP * EIS Y Ak-KTEiGTP TO \ FTP^ 

AlVXlCQlV-v Jl^PSLKS^LVIQELINWivl.C iiC l 

KIANFPRLCEETERIVANHIREREGKTKDQ V*. LLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTi VG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFNTEQRNVYKDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPQIJERQVETIRNLVDSYMSIINKCIRDLIPKTI 

MHLMINlWKDFINSElXAQLYSSEIXJNTLMEra 

AEQAQRRDEMLRMYQALKEALGIIGDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPTTQRRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPQVPSRP1RAPPSVPSRRPPPSPTRPTT 

IRPLESSLLD 


3056 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057 


A 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLlLIILCRCIJrISLSRSVQQLRTSFQDHAVWK 

POvlKVLQNAPDEILVVASSMLCNLLLEFSPSKEPI 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 

QAEQKKADILRSl^TEQLFRLLSDSDLNVLMKT 

LGLLRNLI^IWHTOKIMSTHGKQIMQAVTLILEG 

EHNIEVKEQTLCHLANIAIX3TTAKDLIMTNDDILQ 

KIKYYMGHSHVKLQLAAMFCISNLrWNEEEGSQ 
ERQDKLRDMGrVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 


3059 


A 


679 


167 


SS WPSLSSQMHFPSFHLHVAAHY GRDSFVRLLLE 

FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN 

ANIDIQNGFLLRYAVIKSNHSYCRMFLQRGADTN 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 

TOTWreGQTPIAVSISISGSSRPC^ 


3060 


A 


30 


234 


PPLQLDMDPNCYCADGDSCTCAGSCKCKECKCT 

SCKKSCCSCCPAGCAKCAQGCICKGATDKCSCC 

A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSVNKRPKKETKKKR 


3062 


A 


1589 


276 


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 
VmSSMKNFKAFFRWLYVAMLRMTEDHVLPELN 
KMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 
VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 
SSHLKESPLLFPYYPRKSLHFVKRRMENIIDQCLQ 
KPADVTGKSMNQAICIPLYRDTRSEDSTRRLFKFP 
FLWNNKTSNIJfVXLFTILEDSLYKMCILRRHTDIS 
QSVSNGLIAIKFGSFTYATTEKVRRSIYSCLDAQF 
YDDETVTVVLKDTVGREGRDRLLVQLPLSLVYN 
SEDSAEYQITGTYSTRLDEQCSAIPTRTMHFEKH
WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 
YFEMDIDDEWELDESSDEEEEASNKPVKIKEEVL 
SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 
SNIPFFIFGPLMMLLMHPYAQKRSRYIYVVWVLF 
NfflGLFSMYFHMTLSFLGQLLDEIADLWLLGSGYS 
F^MPIl<T*F?SFL^ iVYZTLLSFL 
RPTVNAYALNSii^LkiL i 3 VCQEYRK1SNKELRH 
LIEVSA/VLWAVAi^rSWISDB^LLCSFWQRIHFFYL 
HSIWHVIJSITFPYGiv!\ 7 TMALVDANYEMPGETL 
KVRYWPRDSWPVGLPYA?B31GDDKDC 


3064 


A 


1523 


925 


AATMADGQMPFSOIYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNVHSFKPEELMVKTKIX3YVEVSGKHEEKQ 

QEGGWSKNFTKKIQLPAEVDPVTVFASLSPEGLL 

IIEAPQVPPYSTFGESSFNNELPQDSQEVTCT 


3065 


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSKLPKKTEQIRLLSQrYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDIPDLLGGNGCLGSWFSESFLTS 

QILVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLQSWPEEGNVHFFSSGLLJFSHCRHGSIIISKD 

HMNSISFYDGDSTSWAAIXIDFKSSLLPHLPVHF 

HGSSNFLMIALFPKSKTYQAFYSEVFSLWKQQDN 

SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 

GEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 

RTHIJPVlXQQAEINTmRIESDKVnSIVTGLPGCH 

ASELCAFLVTIJHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VIX^YTDVIDWQALQTHPDSNVKASFnGAITA 

CV^MSCYMEHRFIJTKCIJXJCSQGLVSNVVFT 

SHTTEQRHPLLVQLQSLERAANPAAAFILAENGIV 

TRNEDIELELSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSIKPSPFSGNIYHILGKVKFSDSERTMEVCYNT 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECiX 

VFIGCSLKEDSDCDWLRQSAKQKPQRKALKTRG 

MLTQQEIRSIHVKRHLEPLPAGYFYNGTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREIEKYNQELE 

QQEYHDLFELKP 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR 

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLN1ATGHWSRYSLVIIPKNWSLFAVNFFVGAA 

GASQLFRIWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSFQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 

KEGWSKAAKLC^RKTKEGUGLLQEGNTTVLVE 

VNCETDFVSRNLKFQLLVQQVALGTMMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGEKMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML 

SQPYLLDPSITLGQYVQPQGVSWDFVRFECGEG 

EEAAETE 


3068 


A


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGL 
VPITDDTSHAGPTC^^ 

RRKDWSCSLLVASLAGAV .IbFLYGY^XSVVNA 

PTPYIKAFVNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGTLIVKMIGKVlXjRKHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFiaGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYIJLLEKHNEARAVKAFQTTTLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQWT 

VIVTMACYQLCGLNAIWFYTNSIFGKAGIPPAKIP 

YVTLSTGGIETLAAVFSGLVIEHLGRRPLLIGGFG 

LMGLFFGTLTTTLTLQDHAPWVPYLSIVGILAIIAS 

FC^GPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLS 

NFAVGLUT>FIQKSIJ3TYCFLWATiaTGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEKIDSAV 

TDGKINGRP 


3069 


A 


861 


300 


AAGAWSAMPKAKGKTRRQKFGYSVNRKRLNR 

NARRKAAPRIECSHIRHAWDHAKSVRQNLAEMG 

IAVDPNRAWLRKRKVKAMEVDIEERPKELVRK 

PYVLNDLEAEASLPEKKGNTLSRDLIDYVRYMV 

ENHGEDYKAMARDEKNYYQDTPKQIRSKINVY 

KRFYPAEWQDFLDSLQKRKMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAGNPVGDKLNVITVGPRGPLLVQDVVFrD 

EMAHFDREI^ERVVHAKGAGAFGYFEVTHDIT 

KYSKAKVFEHIGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKFYTBDGNWDLVGNNTP1FFIRDPILF 

PSFIHSQKRMHJTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGEPDGHRHMNGYGSHTFKLVNANG 

EAVYCKFHYKTIXJGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLBPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MQDNQGGAPNYYPNSFGAPEQQPSALEHSIQYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRKR 

LCENUGHUa>AQIHQKKAVKNF^ 

IQALLDKYNAEKPKNAJHTFVQSGSHLAAREKA 

NL 


3071 


A 


1 


1187 


SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELDCGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSVVQDSRLI^nFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVTRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 

AVRPVBEKRIQKYSEGEIRFNLMAIVSDRKMIYEQ 

KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 

NQMLffiEEVQKLKRYXIENIRRKHNYLPFIMELL 

KTLAEHQQLIPLVEKAKEKQNAKKAQETK 


3072 


A 


103 


2775 


RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLfflSG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSR1PRSLPPPSAIPGLRSPV 

WA ACIjGGGCrr7r?^RGKCQA-»? ^ *7.HRSTM."^ 

LGAGGDGKivvio£,^ VRSETAPDSYKVQDKKNA 

SSRPASAISGQ^fNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQL-V.REIVWLEREERARQHYEKH 

LEERICIQU^EEQRQKEEPJEIRAAVEEKRRQRLEED 

KERHEAVVRRTMERSQKPKQKHNRWS WGGSLH 

GSPSmSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSVVNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPIIMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRnHGTASYKKERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

EEIMKRTRRTEATDKKTSDQRNGDIAKGALTGG 

TEVSALPCTTNAPGNGKPVGSPHVVTSHQSKVT 

VESTPDLEKQPNENGVSVQNENFEEHNLPIGSKP 

SRLDVTNSESPEIPLNPILAFDDEGTLGPLPQVDG 

VQTQQTAEVI 


3073 


A 


67 


2415 


PPRVCRDHVCLICWDPIAGTGGSRSTMPALPLDQ 

LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDNIPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS 

APEVVDMQKM,HR5VFLTn J RMSTHKESKDHFIS 

PSAFGEILYNNFIJFDIPKILDLCVLFGKGNSPLLQ 

KNflGNIFTQQPSYYSDI^ETUTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFUDIFPLACQTFQKHDFCYRLA 

SFYEAAIPEMESAIKKRRLEDSKLLGDLWQRLSH 

SRKKIMEIFHIILNQIClXPn^ESSCDMQGFIEEFL 

QIFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

LDETRTAYILQAVESAWEGVDRRKATDAKDPSV 

IEEWGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

lACl^YYHYDPEQVINNILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSVWEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNQVGANDADSDDELISRRPFTIPQVLRTKVPRE 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK 

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMIPS 


3074 


A 


3 


251 


GEARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

^YHRLTKQTKPDMETYilRLFEK^ 

LLHGTHVPSTEEIDRVV[ViLi : OIEKRD^YSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWE1JCEEEKKKECAARGEDYEKVK 

LmSAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

IXHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

G1XQGQVAIVTGGATGIGKAIVKELIJELGSNVVI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEVhmLVKSlLDTFGKINFLVNNGGGQFLSPA 

EfflSSKGWHAVLETNLTGTFTMCKAVYSSWMK 

KHGGSIVNIIVPTKAGFPIj\VHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSWCFLLSPAA 

arliuv&VlJVDGGRoLY IHoYbVFDHDN WPKGA 

GDIJSWKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 

GQSU>PQKXAYLSHI^TGSGHIEGDWAGRNRKL 

IJKPRSIQKSWFVQFPWIJMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CVNAIAARDPIWAARFRSERDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAff AMYLDCISDLRQKETTDGIHS SSDINILYN 

DAVESCIQDPSAEGLSEEVPWFEELPWFEDVA 

WFTREEWGMLDKRQKELYKDVMRMNYELLAS 

LGPAAAKPDLISKLERRAAPWIKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSICEEGDGPRRIKK.TYRPRSIQRSWFGQ 

FPWL\nDPKETKLFCSACIErUmHDKSS 

YTGPFKVETLKYHEVSKAHRIX^VNTVEIKEDTPH 

TALWEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKILQLLQSTGTV1LGKYRNRTACTQFIKYISETL 

KREILEOVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFETrVSALD 

ELDIPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHIAVVDACGSIDLVKK 

CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEIIR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKIJ^GFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

PTGYSEEALLEEWLGLKTIAQHLPFSMLCKNALA 

QHCRFPLLSKLMAVWCVPISTSCCERGFKAMN 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 

PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 

RLRKEEMG/ : ^ / EEPRTQKP? T T ^S?F \AEV: KT 

'I TtePPERLi .YPHlSQEAPGI^l . 


3079 


A 


343 


1513 


FaPLEPRLCSLGGWGALQAGEt 1 i 1PSRAGCGRE 

GATMGCTLS AEERAALERSKAIEiCvT KEDGIS AA 

KDVKLLIXGAGESGKSllVKQMKIIHi^^GFSGED 

VKQYKPWYSNTIQSLAAIVRAMDTLGIEYGDK 

ERKADAKMVCDWSRMEDTEPFSAELLSAMMR 

LWGDSGIQECTNRSREYQLNDSAKYYLDSLDRIG 

AADYQPTEQDILRTR\aCTTGIVETHFITKNLHFR 

LFDVGGQRSERKKWIHCFEDVTAIIFCVALSGYD 

QVIJIEDETTNRMHESIJCLFD^ 

LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 

YIQAQYESKNKSAHKEIYSHVTCATDTNNIQFVF 

DAVTDVIIAKNLRGCGLY 


3080 


A 


41 


997 


EARTARELTDGVTDGLTMADQPKPISPIJKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPOGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

LDCAKKLYQEFGIRGIYKGTVLTLMRDVPASGM 

YFM1 Yfc WLKNIFT PEuKJCVSELSAFRILVAGGIA 

GIFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 

RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 

EVAMKFLNWATPNL 


3081 


A 


3 


1996 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 

NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENmWRIRRDEEGNEI 

KESNARIVKWSDGSMSLHLGNEVFDYYKAPLQG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

HRKMTLSLADRCSKTQKIRILPMAGRDPECQRTE 

MDCKEEERUlASnUtESQQRRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAAEKNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRL1JCAKKLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A 


3 


921 


VEFCLPASADSSSLVAASI^GVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQ Y AEKKAKKPAL VAKS SILLDVKP WDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 


3083 


A 


3 


921 


VEFCLPASADSSSLVAASlAGVRKMATNrTJUiE 

K^?/:T?^-!lCYDLV^:€*Fy^Q^WGPVAG/ 1 p RQEN 

G,i6V^.DIARARENiQKSLAGSSGPGASSGTSGi ! 

HGF.VVRIASLEVENQSLRGVVQELQQAISKLEA 

RLN VLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RIJIQY AEKKAKKPAL VAKSSILLDVKP WDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPI^PALPKYKLADYRYGREEMLALFIJCDNKIPS 

DIXDKEFLPILQEEPLPPljy^WFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 
MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 
VETPVVGAPGMGSVSTEPPDEEGLKHLEQQAEK 
MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFIMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDU>U)TTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGELRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQFRT J .RKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKT FF.FRERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARWHHSNLOTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASI^SVGVSNRQNKKVEEEEKLLK 

IJQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEEETLDDY 


3085 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGS1TS 

PPI^PALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPDLQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

KEFIRSESENWRIFREE^ ■* ;OEDF.DC ^WRT AC*^, 

DGEK WR?;-^ GPRSAv ' WREHMERRRR- 1. tul 5 

DRDDERGYKRVRSGSGSIDDDRDSLPEWCi.^ 3A 

EEEMGTTOSSGAFLSIJKKVQKEPIPEEQEMDFx'lV* 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFB 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQ1PSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFIMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTIJCMRISDQNIIPSVTRSVSWDTCSIWELQ 

PTASQPTVWEGGSVWDLPlJyrTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWARRFFFAQRRLE 

ENRLRMEEEAARIJRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

P^ARNNTrlSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKMCNNASl^KSVGVSNRQNKKVEEEEKLLK 

IJFC^VNKAQIXiFrQWCEQMIilALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEEETLDDY 


3086 


A 


675 


1334 


IJHPAATSTAWLHVPPGI^MALSWVLTVLSLLPL 

I^QIPIXANLVPWITNATLDRITGKWFYIASAF 

RNEEYNKSVQEIQATFFYFTPNKTEDTIFLREYQT 

RQD^IYNTTYLNVQRENGTISRYVGGQEHFAH 

IJJIJU^TKTYMIAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRIPKSDVVYTDWKKDKCE 

PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKII^^XKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYY SEADASHCIHQILES VNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTTNPAKRITA 

DQALKHPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTVVH 

NATDGIKGSTESCNTTTEDEDLKVRKQEIIKITEQ 

LIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPmTTILNPHVHVIGED 

£ACTA\;rX'^ 

M.NVHYHCSGAPAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

C T - KWSAELARLGESIMDGKQGGMEKjSKPAGPR 

b^OnUXSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVN1ITADHVSPLHEACLGGHLSC 

VKEIJmGAQVNGVTADWHTPLFNACVSGSWD 

CVNIXLQHGASVQPESDI^SPIHEAARRGHVEC 

WS1JLAYGGNIDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

I^REGPPSIJ^QLCEUJURKCTGIQQHHKITKLVLP 

EDLKQFLLHL 


3089 


A 


73 


432 


DMAGLMTIVTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKRIPENRWSYQLSSRSTCLKAGVIFTTKK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNLVLAERKSPEISERTVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKNLVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 

GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCTKQSLPEEISSCLhrrKS 

EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKThTTOQEhnU.ESVPQALTGLPKECLPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 

QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 

KEQQKDSPWCRFFHFRRLQFNKHGVLRVEGFLT 

PhOCYDNEAIGLWLPLTKNVVGIDLDTAKYILANI 

GDHFCQMVISEKEAMSTBBPHRQVAWKRAVKG 

VREMCDVCDTTIFhnJHrVVVCPRCCFGVCVDCYR 

MKRKNCQC^AAYKTFSWLKCVKSQIHEPENLM 

PTQHPGKALYDVGDIVHSVRAKWGIKANCPGSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 

WlJVDLTSGNVNKENKEKQPTMPILKNEIKCLPPL 

PPL^KSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

PQGLTIKPSILGFDTPHYWLCX)hnQ,lXLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK 

PESFRKEFGEQEVDLVNCRTNEIITGATVGDFWD 

GFEDWNRLKNEKEPMVLKLKDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

DAAKS^MVYVGIPKGQCEQEEEVLKTIQDGDSDE 

LTIKRFmGKEKPGALWHIYAAKDTEKIREFLKK 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 

EYGVQGWAIVQFLGDWFTPAGAPHQVHNLYSC 

: r:^A^FVSPEWVKHCF\v V*7";BFRY r SQTK^r 7 

DKLQVKNVV ;^ . j 


3091


A


97 


1838 


KRGARRGGWKRKMPSTDLLMLKAFEPYLEiio ; V~| 

YSTKAKNYVNGHCTKYEPWQLIAWSVVWTLLI ! 

VWGYEFWQPESLWSRFKKKCFKLTRKMPIIGRK 

IQDKL^TKDDISKNMSrTJCVDKEYVKAIJ^SC^ 

LSSSAVLEKLKEYSSMDAFW QEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DlJU^KGIKTPEIVAPQSAHAAFNKAASYFGMKJ 

VRWLTKMMEVDVRAMRRAISRNTAMLVCSTP 

QFPHGVIDPVPEVAKLAVKYKIPLHVDACLGGFL 

IVFMEKAGYPLEHPFDFRVKGNTreiSADTHKYGY 

APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTIAGSRPGGISAACWAALMHFGENGYVEATKQI 

IKTARFLKSELEMKGIFVFGNPQLSVIALGSRDFD 

IYT^Sl^MTAKGWNLNQI^rTPSIHFCITLLHAR 

KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 

MAQTTVDRl^GAELSSVFLDSLYSTDTVTQGSQ 

MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 
RAMKIQKKLTGCSRLMLLCLST .ET J ,LBAG AGNIH 
YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRI 
VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 
PCLVSF>nLVEDKMKIJTVEVEIlDINDNTPQFQL 

EELEFKMNEITTPG111VSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPBMVLQSPLDR 

EEEAVHHLDLTASDGGEPVRSGTLRIYIQVVDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQIXjAGLMAKVKVLI 

KVLDVNDNAPEVTTTSVTTAVPENFPPGTDALISV 

HDQDSGDNGYTTCTIPGNIJFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAEDQGTPALSTETfflSL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEILYPALPTDGSTGVEL 

APRSAEPGYLVTKWAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

WAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 

IXAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNYAD 

TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 

ECISYLEKNNS 


3093 


A 


1 


3868 


PPDNQK1X5LLEALLKIGDWQHAQNIMDQMPPYY 

AASHBXIAlJUCKLIfflTIEPLYRSVTSWAVDHAG 

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKWRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLl^FFEXJVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRLYGQWKNETYNSHPLLVKVKAQTED 

RAKYIMKRLTKENVKPSGRQIGKLSHSNPTELFD 

YVCFEILSQIQKYDNLITPWDSLKYLTSLNYDVL 

AITLSVC^,* \ANPO0Zr^irf ^nTITSSW^QSL> 

SFCGA vv ^VL-g?LAGLLQVYANQLKAGKSFI)L 

HLKEWQ ; *MAGIEITEENCIMEQ 

KAEGGYFGQIRNTKKSSQRLKDALLDHDLALPL 

CLLMAC^RNGVlFOEGGEKHIJaLVGKLYIXJCH 

DTLVQFGGFLASNLSTEDYIKRWSIDVLCNEFHT 

PrmAAFFLSRPMYAHmSSKYDELKKSEKGSKQ 

QHKVHKYITSCEMVMAPVHEAVVSLHVSKVWD 

DISPQrTATFWSLTMYDIAWHTSYEREVNKLK 

VQMKAIDDNQEMPPNKKKKEKERCTAIXJDKIX 

EEEKKQMErTVQRVLQRIJCLEKDNWLLAKSTKN 

ETTIXFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTIJLCYDRWSDnYTVASCTENEASRYGR 

FLCCMLETVTRWHSDRATYEKECGNYPGFLTIL 

RATGFIXKjNKADQLDYENFRHVVHKWHYKLT 

KASVHCLETGEYTOIRNILIVLTKILPWYPKVL>^ 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYNOPENEFHHKDPPPRNAVASVQNGPGG 

GPSSSSIGSASKSDESSrEKrDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG 

KEKEKEKKEK 1 PATTPEARVLCj J^uJMiKPKJEfiR 

Ph^EKARETKERTPKSDKEKEKFKKEEKAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLIELKESSAKLYINHTPP 

PI^KSKEREMDKKDIJDKSRERSREREKKDEKDR 

KERKRDHSNNDREWPDLTKRRKEENGTMGVSK 

HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 

KSBKMDKISSGGKKESIOIDKEKIEKKEKRDSSGG 

KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 

PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 

ENFFQELBKARKGSGMMSKSDNFGEKMKEFMQ 

KYDKNSDGKIEMAELAQILPTEENFLLCFRQHVG 

SSAEFMEAWRKYDTDRSGYEEANELKGFLSDLL 

KKANI^YDEPKLQEYTQTILRMFDLNGDGKLGL 

SEMSRLLPVQENFJJLKFQGMKLTSEERnJAJFTFY 

DKDRSGYIDEHE1JDAIJJCDLYEKNKKEMNIQQL 

TNYRKSVMSLAEAGKLYRKDLEIVLCSEPPM 


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 

LTWTLASSVVMGLVGTYSCFWTKYMNHLTVHN 

REVLYELIEKRGPATPLITVSNHQSCMDDPHLWG 

HJCIJIHIWNLKLMRWTPAAADICFIXELHSHFFS 

LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 

AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPIILPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 

EAQPEWLRAEVKRLSHELAETTREKIQAAEYGL 

AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

GQAHTNHKXVAADGESREESLIQESASKEQYYV 

RKVLELQTELKQLRNVLTNTQSENERLASVAQE 

LKEINQNVEIQRGRLRDDIKEYKFRF A RLLQDYS 

?J EEENISLQ XQVS* 'LRO> T OVEFEGI '" r £W^LT 

EETEYLNSQIJBD^JRLKC * r o iQLEEALETLKTER 

EQKNSLRKEl^HYMSINDSFYreHlJWSLDGIJGF 

SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 

TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 

QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 

VTRLTENLSALRRLQASKERQTALDNEKDRDSH 

EDGDYYEVDINGPEILACKYHVAVAEAGELREQ 

LKALRSTHDEAREAQHAEEKGRYEAEGQALTEKV 

SLLEKASRQDRELLARLEKELKKVSDVAGETQG 

SLSVAQDELVTFSEELANLYHHVCMCNNETPNR 

VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 

LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 

PRREPMMYNLIAIIRIXJIKHLQAAVDRTTELSRQ 

RIASQELGPAVDKDKEALMEEIIJQJK5LLSTKRE 

QITTLRT\^KANKQTAEVA1ANLKSKYENEKAM 

VTETMMKLRNELKALKEDAATFSSLRAMFATRC 

DEYTTQLDEMQRQLAAAEDEKKTLNSLLRMAIQ 

QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 

IPSVSHTCACASDRAEGTGLANQVFCSEKHSIYC 

u 


3097 


A 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKITA 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD 
GIDAGKAVTLQQGFNQGYKKGAEVILNYGRLRG 
TI^ALLSWCHLH]snWSTLINKI>^IJ)AVGQCEE 
YVLKHLKSITPPSHYWIXDSIEDMDLCHVVPAE 
KKmEAKI)ERLCENNAEFNKNCSK5HSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVUjLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 

FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 

EOTKKYIPGTKMIFVGIKKKEERADLIAYLKKAT 

NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRWRVEHRATRQPYAIKMIETKYRE 

GREVCESElJlVIJEtRVRHANnQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV 

LIXjVRYIJHAmiTrlRDIJCra 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWWSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 


3100 


A


3 


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDIEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV 

ELYKG;.Uj^:.^.mLPAFYf^SQTr^ . 
AFGDVF5 VIGVREPQPAASEAFVKFADAHRSIEK 
FGIRLLKim? ^TDLNTYLNKAIPDTRLTIKKYL 
DVKFEYLSYCLICVKEMDDEEYSCIALGEPL^ 
STGNYEYRLIIJRCRQEARARFSQMRKDVLEKME 
IIJXJKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 
DADVFPmVDIAHTTLAYGLNQEEFTDGEEEEEE 
EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101 


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRrnDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPa 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDIXITrWEAMEKCKDAGLAKSIGVS 

NFNEtRQLEMILNKTOLKYKPVCNQVECHPYFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAVVAFHRGNFREL 

YKII^HQFSPHNHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVIJ^WYAHNPYPSPREKRELAEATGLTT 

TQVShTWFKNRRQRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKKEQAKNKEDShnRENSSGAGKTKRAFD 

FSAHGRWWALmYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPEimELLMEKNPQKPQYSMAVEFPLVLY 

IXXFENVKWTYDQEAQEFNITHLQQLWANHAV 

KTHMLYSML(^LDTVPVPCGIGPKMDGMTEWG 

KVKPSVIKQTSAFVEGVKMRTYKPLMDRPKCQG 

LESRIQHFVRRGRIEHPHBLJ 7 HEEETKAKRDCNDT 

LEEDNTNLETPTKRVCVDTEIKSn 


3104 


A 


227 


1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGfflQTALYGKMGRVRSPHPYGH 

RKJ1TMSDGATSTFDLFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 

LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 

QLVWGFSLGGNIVCKYLGETQANQEKVLCCVS 

VC(^YSALRAQETFMQWDQCRRFYNFLMADN 

MKJOILSHRQALFGDHVKKPQSLEDTDLSRLYTA 

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 

HRIYVPLMLVNAADDPLVHESLLTIPKSLSEKRE 

NY: T ^^^^LFGGHLCFrao^VLFPFPLTWMT- C: V 

V£YANAICQWE3WKiX^Sm^ 


3105 


A 


1 


i2 I 


MGLLLMELASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LF1J 7 RELRDTALTRRWVTKKIKVEFEELIX5TKTA 

GRLIJBGLSIJRDVFIXjETVPFIKTIRLVRPVVPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 

IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSnVNQ 

LKKnKRKHTIJWYKIRFKPIlT^ 

HimQQWALTEGRLKVTLLECSRLLIFGSYDREA 

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 

VPLRQCPG 


3106


A 


972 


468 


MAAAGAGRLRRVASALLLRSPRLPARELSAPAR 

LYHKKVVDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVMKLQIQVDEKGKIVDARFKTFGCGSA 

IASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP 

VKLHCSMLAEDADCAALADYKLKQEPKKGEAE 

KK 


3107 


A 


106 


1221 


TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGIVFLETSERMEPPHLVSCS 
VESAAKIYPEWPVVFFMKGLTDSTPMPSNSTYPA 

FSFI^AIDNVFICTLDMKRLLEDTPLFSWYNQINA 

SAERKWIjnSSDASRLAIIWKYGGIYMDTDVISIR 

PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME 

NFVEHYNS AIW GNQGPELMTRMLRVWCKLEDF 

QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW 

DTEPSFNVSYAIJILWNHMNQEGRAVmGSNTLV 

ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 


3108 


A 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQRNFQ 

LMRDLDQRTEDLKAEEDKLATEYMSSARSLSSEE 

KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 

VDKHIRRLDTDLARFEADLKEKQIESSDYDSSSS 

KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 

QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 

DPNEFITCLCHQVSYGENOGCDNPIX^IEWFHFA 

CVGLTTKPRGKWFCPRCSQERKKK 


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 

AYQFSVLAQNKLGTSAFSEVVTVNTLAFPITTPEP 

LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 

EIXjLARPVIAGIVATICFLAAAILFSTLAACFVNK 

QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 

SIRTLRAPSESSDDQGQPAAKRMLSPTREKELSL 

YKKIXRAISSKKYSVAKAEAEAEATTPIELISRGP 

DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

ssv: r sppj,pt^gpf^:.^^^enge: t AonS' t t ^lt 

QTrTGGRSPEPWGR :.• >rf^v^TPAMMfi»HQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 

EAPKGWAGKSPGRGPVKVFPAAKWQDRPMQPL 

VSQGQLRHTSQGMGIPVLPYFE?'AEPGAHGGPST 

FGLDTRWYEPQPRPRPSPRQARRAEPSLHQWLQ 

PSRLSPLTQSPLSSRTGSPELAAiRARPRPGLLQQA 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSR5GSPS 

YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 

QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 

ATSPPERALSKL 


3110 


A 


88 


924 


IIXiSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV 

AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 

KNPFWSSDNPYTRWLASTEGLQYSLHGLAAGA 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVl^SKAQTYELERRFRQQRYLSAPEREHLA 

SLIRLTr^XJVKIWFQNHRYKMKRARAEKGMBVT 

PLPSPRRVAVPVLV1UX5KPCHALKAQDLAAATF 

QAGEPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 


3111 


A 




901 


PQVAQT ATJWP^nP AT WT>OCTJQ*\7T>/^"KTD AT POUT I XJ 

GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD 
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE 
PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST 

RHTb^NTHYSDlJVAVNCCIJTRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV 

APLIPYPLITKEDINAIEMEEDKRDLISREISKFRDT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE 

RREREREREREREREKEKERERERERDRDiU)RTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERKKIJERjKLREKEA 

AYQERIJCNWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEE1RQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGII1PHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

DSVFNKFEDEDSDDWRKRKLVPLDYGEDDKNA 

TKGTVNTEEKRKHIKSLIEKIPTAJCPELFAYPLDW 

SIVDSEMEIUUI^WINKKnEYIGEEEATLVDLVC 

SKVMAHSPPQSIIJJDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPGLET 
NIUCMTTPNKTPPGADPKQLERTGTVREIGSQAV 
WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 
VMQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GNNFHl^QEIRQLELVEPSGWIHVPLTDNHKKPT 
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTEDFMMYRSIR 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAH1APSSELHLPQSQSA 

GPPPLGAGTEVELWPGRDEGSRGALPGSSOVKF 

\^ T CTVPFPVSIXi7RTLra 

VQFTiG WRSLLGRTLGTEvINl to .TviMAQlLI^SH 

IJKATVIP1®VKMIJ>YFGI^^ 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVl^rrVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

DNl^GKGKPLKKFSDCSYTOPMTHNLNRILIDNG 

YQPEWILKQKEISDTEEQLREAILVSRKKLGNPMT 

PTEKXQWNHVCEQFQENIRKLNKRINDFNLIVPI 

LTRQKVHFDAQKEIVRAQKIYETUKTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 


FRHRCXjCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPIX^RMEflRSNFKSNLHKVYQAM 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCHTKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSEDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQKKFIDQVVEKffiDLLQSEENK^ 

EPCTGFQRKLIYQTLSWKYPKGIHVE1IJETEKKE 

RYTVISKVDEEERXRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKLVIGHNMI1JDVMHTVHQFYC 

PLPADl^EFKEMTTCWPRLLDTKLMASTQPFKD 

HNNTSLAELEKRLKETPFWPKVESAEGFPSYDT 

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 

HVSARSKLIEPFrmmjvmV^ 

QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 

WIDDTSAFVSI^QPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYIUWSFTAPSTVGKKNLSPSQE 

EAGLEDGVSGEISDTBLEQTDSCAEPLSEGRKKA 

KKIiCRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSIFEFDVFTRIJQPWrTLLKN 

WQIXAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRIXjQWMGYVSSIXjSILQTIPANKPLS 

QVIJUBGQKIXjFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAMDSTFELCKICAESNKDV 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKKHSNNWLAQHWF 

QSSIILCFSPVGRTLRVRARKFPAIVNCTAIDWFH 

AWPQEALVSVSRRHEETKGffiPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHhTyTTPKSFLEQISLF 

KNLLKKKQNEVSEKKERLVNGIQKLKTTASQVG 

DliCARLASQEAELQLRNHDAEALITKIGLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPr ' >^ATA/,LNTriV»^^ELSG; *r*>™frr 

NVT. AVMVLLAPRG; , rr^fWKAAKVFMGK 

YDDFLQAUNYDKEHDrK- VCLKWNEHYLKDPEF 

NPNLERTKSFAAAGLCA^ 7CTUKFYEVYCDVEP 

KRQAIAQANLELAAATEKlJE^iB^KKLVVSANYD 

IEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THCERWPLVIDPQQQGIKWIKNKYGMDLKVTH^ 

GQKGFLNAIETAIj^FGDVIlJENLEETIDPVLDPL 

IXJRKITKKGKYIRIGDKECEFNKNFRl^^ 

PHYKPEIXJAQlTLL>JFIVrEDGLEAQLLAEVVSI 

ERPDI^KIXLVLTKHQM)FKIELKYLEDDLIJLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERKINEARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKXEmPLELDFIXRFIVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKXKSLIQKLIIJLRAMRPDRMTY 

ALKNr VEEKJLG AKY VER I KLDL VKAFEE6SPATP 

IFFILSPGVDALKDLEILGKRLGFITDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EHHPQGLLENSIKITNEPPTGMLANLIiAALYNro

Q 


3118 


A 


1 


226 


PYSLSTSCLGSPTSPRLEMDFNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE 


3119 


A 


1254 


4133 


PLATLTMEEQGHSEMEEPSESHPHIQLLKSNREL 
LVTmRNTQCLVDNLLKNDYFSAEDAETVCACPT 
QPDKVRKHJDLVQSKGEEVSEFFLYLLQQLADAY 
VDLRPWLLEIGFSPSLLTQSKVWNTDPVSRYTQ 
QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME 
LVGFSNESLGSLNSI^CLLDHTTGILNEQGETEFIL 
GDAGVGKSMUXJRLQSLWATGRLDAGVKFFFH 
FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 
WAFIJLRFPHVAliHTFT>GIJ)ELJiSD^ 
SCPWEPAHPLVLLANLLSGKLLKGASKLLTART 
GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER 
ALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQH 
FRAAFEGSPQLPIXnMTLTDVFLLVTEVHLNRM 
QPSSLVQRimSPVEILHAGRDTLCSLGQVAHR 
GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 
ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG 
TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG 
SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL
RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 
VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 
AARGICANYLKLTYCNACSADCSALSFVLHHFP 
KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 
VNQITDGGVKVLSEELTKYKTVTYLGLYNNQITO 
VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 
LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA 
LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL 
.T* WLTQ>^EL:!r?F> 7 AESLAE.\£ x ?r Tr TLK T !I WL 
iQi f ^ITAKGTAQLADAIXJSNTGrrEICLNGr i uO> 
EEAKVYEDEKRHCF 


3120 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTEPKRHELVKGPK

KVL^/DKDAELVAQWNYCTLSQEILROTIVACE 

LGRLYNKDAVffiFLLDKSAEKALGKAASHIKSIK 

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

FrCPWGLEMNGRHRFCFLRCCGCVFSERALKEI 

KAEVCHTCGAAFQEDDVTVLNGTKEDVDVLKTR 

MEERRLRAKIJEKKTKKPKAAESVSKPDVSEEAP 

GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTIHS 

SAKRSKEESAHWVTHTSYCF 


3121 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVWEYSSEI^KHQLYIDETWISNIPTNLR 

VUlSIIJEhOlSKIQKI^DVSAQMEYCRTPCTVS 

CNEPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYK(^FGNVATOTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLLEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTM11HNGMFFSTYDRDNDGWLTSDPRKQCSK 
EIXXJGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGWWMNWKGSWYSMKKMSMKIRP 
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKIJCTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYroETVNSNTI^NLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVXAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDIX3WWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3123 


A 


3 


1490 


HASGPTRPVSWSrllKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPWSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVA'H.TDGKNYCGLPGEYWLGNDK 

ISQLTFMGPlELLrfjrviED'VT* "^DK^T-IA'r.T 1FY\ 

QNE- .>;X7QISVNi:CyRGTAGNAIAil.aAJCi--MGE 

NRTMTIHNGMFFSTYDKDNDGWLTST. ?RKQCSK 

EDGGGWWYNRCHAANPNGRYYWGOQ Y rWDM 

AKHGTOIXjVVWMhnVKGSWYSMKKMSKi]^J^ 

FFPQQ 


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA 


3125 


A 


3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTVIENGEIRFNGKGKKIRKPR 

TTYSSLQI^ALNHRFQQTQYLALPERAELAASLG 

LTQTQVK1WFQNKRSKFKKLLKQGSNPHESDPL 

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 

MPGYSHWYSSPHQDTMQRPQMM 


3126 


A 


43 


5377 


LSVPFP1PVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

UlLKu IN r Ali AHQ VLKrFNLKS SPSSGELMFMER Y 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST 

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPIPM 

LQEDFWISTALVEPTAPLREVLEDLSFPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI

DHVLLNADGIRGFPWLQQISKSLNYLLMSASQT

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA

SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS

RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK

VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL

AQENLSLSVPQVIVSCCCEPLALCSSRQSQQTSSL

LTRLGTI^QLHASHCLDDU>LSTPSSPRTTENPTL

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA

CLGASPRIJKVSKPSLSWKELRGRREVPLAAEQV

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS

IAVNU:GWASLSTVLLGLHSPIALDVLSEAFEES

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC

AVACDKEGWQYLFPVKDASLRSRLALQFVDRW

PLESCLEILAYCISDTAVQEGLKCELQRKLAELQ

VYQKIIXILQSPPVWCDWQTUISCCVEDPSTVMN

MILEAQEYELCEEWGCLYPIPREHLISLHQKHLL

HLLERRDHDKALQIXRRIPDPTMCLEVTEQSLDQ

HTSLATSHF1ANYLTTHFYGQLTAVRHREIQALY

VGSKJLLTLPEQHRASYSHLSSNPLFMLEQIXMN

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL

LSRYAEKALDFPYPQREKRSDSVIHLQERVHQAA

DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT

QPSQEFVPPATPPARHQWVPDETESICMVCCREH

FTMFNRRHHCRRCGRLVCSSCSTKKMVVEGCRE

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE

SPPYSFWRVPKADEVEWILDLKEEENELVRSEF

YYEQAPSASLCIA1LNLHRDSIACGHQLIEHCCRL

SKGLTNPEVDAGLLTDIMKQLLFSAKMMFVKAG

QSQDLALCDSYISKVDVLNILVA AAYRHVPSLDQ

ILQPA/YTRL^MQLLFAEYYQLv r, rrv'STKT"'LDT 

TGAWHAWGrMCL*\* :~>TLTAAI\EKFSRCLKPPF 1 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVPEGKIMNNTYYQ 

ECLFYIJHKYSTNLAnSFYVRHSCL^ 

ESPPEVFIEG^QPSYKSGKLHTLENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH 

LKIYLQETSRSSGRKKriFFRKKMTAADVSRHM 

NTU?LQMEVTRFUmCESAGTSQITTLPLPTLFG 

NNHMKMDVACKVMUJGKNVEDGFGIAFRVLQ 

DFQLDAAMTYCRAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTBLLNCLEAFKRPPQCCFCSA 

QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV 

KQEHSRATALVQQVQQAAKSSGDAWQDICAQ 

WLLTSHPRGAHGPGSRK 


3127 


A 


467 


1259 


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP 

PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL 

KTQLATLTSSLATVTQEKSRMEASYLADKKKMK 

QDLEDASNKAEEERARLEGELKGLQEQIAETKA 

RLITQQHDRAQEQSDHALMLRELQKLLQEERTQ 

RQDLELRLEETREALAGRAYAAEQMEGFELQTK 

QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA 

ARLKSHFQAQLQQEMRKVIIfflSFKHQPLT


3128 


A 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL

LTAGPALGWNDPDRMlXRDVKALTLrlYDRYTT 

SRM,DPII^UCCVGGTAGOTSYTPKVIQCX}NKG 

WIXjYDVQWECKTDLDIAYKFGKTVVSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGUTTVVLLGIA 

r V V I KLr L.aL>uv£ i orrr Yofc I rrr oHKY v^Kr I JNc> 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGELGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR 


3129 


A 


2340 


1192 


EIARRPKQQSSEKSR^MIRNWLTIFILFPLKLVEK 

CESSVSLTVPPVVKl^GSSTbTVSLTLRPPLNATL 

VrrFEITFRSKNrnLELPDEVVVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIGWIYFVAWSISFYPQVIMNWRRKSVIGLSF 

DFVAI^TGFVAYSVFNIGLLWVPYEKEQFLLKY 

PNGVNPVNSNDVFFSLHAVVLTLIIIVQCCLYERG 

GQRVSWPMGFLVLAWLFAFVTMrv AAVGVITW 

UJFLFCFSY1KLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLIJDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDVVFFIQHFCLYRKRPGYD 

QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

WGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK 

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

M>ITLHCVDQIYGWDEKKKTLFGQLKKYPEKLII 

HCKDLRWQFCLRYTKEEEVKRIVSGIIHHTQAP 

KLLKRLFLFSYATAAQNKI^TDPKNHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PA^/Vr^L;^JVORFQGHGIFTWCWSr.HNGS 

Y£!'/KTEDI^SNF^LQEIQTAYSKFKQLFLIDNSr 

EF V,T)TOIKWFSlXESSSWLDnRRCLKKAIErrEC 

MEAQ>fivIl> T VlXLEl^ASDLCCLlSSLVQLMMDPH 

CRTRIGFQSLIQKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

lAjiUvJoKLIN SSDrLQDNFRlib YDS WHSKSTDYH 

GLLLPHffiGPEIKVWAQRYLRWIPEAQILGGGQV 

ATI^KLLEMMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARI^SLFPFAIXQRHSSKPVLPTSGW 

KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSIPAITRYWFAATVAVPLVGKLGL1SPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NLYFLYQYSTRI^GAFDGRPADYLFMLIJ^NWI 

CI VI 1 OLAMJJMQLLMirLiJVio VLY V WAvJLNRDM 

IVSFWFGTRFKACYlJ'WVILGFNYnGGSVINELIG 

mVGHLYF^MFRYPMDLGGRNFLSTPQFLYRW 

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRTOYWGNSHQIN1HLLSVGSILQL 
HAGWPDLLWAAHHACPRD


3133 


A 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLl^NSKSDEHVDWVlXjimKFVIPSEVKS 

ECHQIX^PRAISIQSSEMIAThnilHCPNCRHSDIJEA 

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFQRH 

AHEQDTKMHEIYKGMTP^ 

WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGILLR 

SAElJU.LLJII^QANTLKSPVSESVSPVVPDYLP 

TENGDFLSSKKKQISRDINRIRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLIJEAASLSENLDISKEETPPV 

RTLK5QSSI^GKPKERCPPNlJVPU:VSYKNMKRS 

SSQMSLDllSLDSMILEEQLLESDGSDSHMFLEKG 

NKKNSTThT^RGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSWVFKTTGVNGEIDIRGE 

DTEIC1XJVNQVTPDQLGNISLRHYLCNRPVGSDQ 

KAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 

QCHIENFSTEFLTSSLMNIQHI^EDETVATVMPM 

KIQVSNTKINLKDDSPRSSTVSLEPAPVTVHIDHL 

WERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHUCKMTVE 


3134 


A 


9 


1579 


EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 

ERERWXTFEY^qCKVGRGTYC 7 VKARRl^G 

KDEKEYALKQIEGl v : r^JSACRE>V\LLRELKHPN ! 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 

RASKANKKPMQLPRSMVKSLLYQILDGIHYLHA 

NWVUIRDLKPAl^VMGEGPERGRVKIADMGF 

ARLFNSPLKPI^VDLDPVVVTFWYRAPELLLGAR 

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLKYMEKHKVKPDSKVFLL 

IXJKLLTMDPTKRITSEQALQDPYFQEDPLPTLDV 

FAGCQ1PYPKREFLNEDDPEEKGDKNQQQQQNQ 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 

AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 

SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS 

QSTLGYSSSSQQSSQYHPSHQAHRY 


3135 


A 


3 


1111 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ 

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKEKREKE 

RRRHGLGGAREAGGASREENGEVKPLPRDKIKD 

KIKERDKEKEREKKKHKVMNEIKKENGEVKILL 

KSGKEKPKTNIEDLQIKKVKKKKKKKHKENEKR 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

KDYVGKNLDTKKYDSKIPENSEFPFVSLKEPRVQ

NNIJQfUJDTI^nCQLIHIEHQPNGGASVIHCLQ 



3136 



1442 



682 



TAAMSIFTPTNQIRLTNVAVVRMKRAGKRPEIAC 
YKNKWGWRSGVEKDLDEVLQTOSVFVNVSKG 
QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 
KERHTQLEQMFRDIATIVADKCVNPETKRPYTVI 
LffiRAMKDIHYSVKTNKSTKQQALEVIKQLKEK 
MKIERAHMRLRFILPVNEGKKLKEKLKPIJKVIES 
EDYGQQLEIVCLIDPGCFREIDELIKKETKGKGSL 
EVLNLKDVEEGDEKFE 



3137 



3143 



MVEGKRHVLHGGRQERMRAKQKGKPLKSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFWGERVWVNGVKPGWQY 

LGETQFAPGQWAGWLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVIPLRESVLNSSVKT 

GNESGSNl^DSGSVKRGEKDlJaGDRVLVGGTK 

TGWRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVIRIGFPSTSPAKA 

KKTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HIEQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDKLRAANEKYAQEVAGLKDKVQQ 

ATSENMGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAVMEGIKMEHQLELGN 

LQAKTOLETAMHVKEKEALREKLQEAQEELAG 

LOW "RAOJ FVO ASQHRL^LQr ^^^RHDAIL 

RVHELEKLL . ti i QAQAIErT-KjbQlSIj\EKKML 

DYERLQRAEAQ 'JKQEVESLREKLLVAENRLQAV 

EALCS SQHTHMiil SI ^DISEEHRTKETVEGLQDKL 

NKRDKEVTALTSQTEIviLRAQVSALESKCKSGEK 

KVDAIXKEKRRLEAEIJETVSRKTfflDASGQLVLIS 

QELLRKERSLNELRVLLLEANRHSPGPERDLSRE 

VHKAEWRDCEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQTIGNSGSANGI 

HQQDKAQKQEDKH 



3138 



110 



2499 



QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWrKJEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLIj\NSPLN1EDAPQRLRWQAHLE 

FIHNHDVGDLTWDKIAVSIJ > RSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAnEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDDELSLITLHWFLTAF 

ASVVDIKLIXR1WDIJFTYEGSRVLFQLTLGMLHL 

KEEELIQSENSAS1PNTLSDIPSQMEDAELLLGVA 

MRIAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTh^QVVRRRTQRRKSTTTALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDEU}FRKNDimVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPAIJCALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQlVflDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPIiCEGVRDMLVKHHLFSWDVDG 


3139 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLB 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRV1JIALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAEEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVYDIKLLLRIWDLFFYEGSRVLFQLTLGNILHL 

KEEELIQSENSASEFNTLSDEPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTOPKNCS 

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDimVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYS1AGDDSVTEGVTDLV 

RGTLCPALKALFEHGIJCKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

TCELLYi*. - .VOSYN* ^TflV ? x TDVKLRSLir*Y 

ULNEQVimWLEVLCSSl l.vii:-: . YQPWSFLRS 

PGWVQKCELRVLCCFAFSi> >DWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWi; Y!>3 


3140


A 


1 


4939 


SAALGASLAIPRPGLPGVHGRGPC ii^GRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLV1TKIEEGSKAAAVDKLL 

AGDEIVGINDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSrmLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATK^HEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTRRSDRFATTLRNEIQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL

AGTYKDHLKEAQARVLRATSFKRRDLDPNPGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRRRFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

IXPPKQQrILRLQTATMETSRSPSPQFAPQKLTDK 

PPLUQDEDSTrUERVMDNNTTVKMVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEKDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

LATNSTYYSTSAPKAELLIKMKDLQEQQEHEEDS 

GSDLDHDLSVKKQELIESISRKLQVLREARESLLE 

DVQANTVLGAEVEAIVKGVCKPSEFDKFRMFIG 

DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 

DILANYLSEESLADYEHFVKMKSALIIEQRELED 

KIHLGEEQLKCLLDSLQPERGK 


3141 


A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVIVNPWHMKKAFKVMNELRSQNLLCDVT 

T^'AEDMEIt A FTRA'VLAACSPYFHAN. 1 ; \* rniSMSEf r 1 

AKRVRIKEVDG^ TLRI* 1 " Tj *'VYTAI:!QVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGffiQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVKNSSACKNYLIEAMKYHLLPTEQRILMK 

SWTRLRTPMNLPKLMVWGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTVDSYDPVKIXJWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGVVGGLLYAVGGYD

GASRQYI^TVECYNATTNEWTYIAEMSTRRSGA 

GVGVLNNLLYAVGGHDGPLVRKSVEVYDPTTN 

AWRQVADMNMCRRNAGVCAVNGLLYWGGD 

1X}SCMj\SVEYYNPTTDKWTWSSCMSTGRSYA 

GVTVIDKPL 


3142 


A 


1211 


1311 


FSNLTTEKVAHAKEENLSMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSI 

KEEPKEAKHPDSQSMEESKLKNDDRKTPVNWK 

DSRGTRVAVSSPMSQHQSYIQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKVOTSPSVNTKTITESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG

QRHUmfflHTHVGMGYPLIPGQ YDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


3144 


A 


78 


604 


SVSGR^DLI^YLiHT^SNMNLDGSAQDPEKREYS 

SVCVGREDDIKKSERMTAVVHDREVVIFYHKGE 

YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 

HKYKITLATGEGLYQS1NPKDPSAKPKWCSKGIK 

QRIinVIVDNGNIYVTLSNEPFKCDSDFYATGDF 

KVIKSSS 


3145 


A 


2 


333 


RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFLNHHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 


3146 


A 


3 


1151 


VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 

LGDPKILFLFYFPAAYYASRRVGIAVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFILAH 

FPHQVl^GLITGAVLGWLMTPRVPNIERELSFYG 

LTAIj\miX}TSLIYWTIJFT^ 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC 

YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 

HPPQISLFYIF^KYTLWPCLVIALVPWAVHMF 

SAQEAPPIHSS 


3147 


A 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA

ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 

GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 

DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 

CMRHAMCCPGNYCKNGICVSSDQNHFRGECEETI 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 

va^ss-^CARCi" ^vHfwsiiicktv? t 'egqvc 

TKHRRKGSHCi i Hit vivCYCGEGLSCRIQKDHHQ 
ASNSSRLHTCQ^a 


3148 


A 


1 


1562 


MSILYDIRAHKAQLL 7.FFASSDSNKALEQRRTLH 

TPKLEHIJDRVLYEWFLGICRSEGVPVSGPMLIEK 

AKDFYEQMQLTEPCVFSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPWPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI 

QHLPVAYKAQGNAWVDKEIFSDWFHHIFVPSVR 

EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTIFLPASVASLVQPMEQGIRRDFMR2N1FINPPVP 

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRKLWPSVAFAEGSSSEEEIJBAEOTVKPHNKSF 

AHELELVKEGSSCPGQLRQRQAASWGVAGREAE 

GGRFPAATSPAEWWSSEKTPKADQDGRGDPGE 

GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 

Q1JULRAVFRSQQQVRRRRGALGAVVKVEALQ 

EGPGGCGATAQSPLPCSSTAGDN 


3149 


A 


132 


4125 


VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA 

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRAIX^SWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEKNTFII^TLGTGWVEGTLPLVTTNFSP

LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVFIPVLAPMPASTIYAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPWAPTPAPIFTPAPTPMPAATP 

AABPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPWASIXLNKJDPNLGLNRDPRHLPKQ 

EPISHDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKEI^LWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LEJ5VVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDWFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDI^ESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKRADSHEEGSLEKKAKSSFRDFIP 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKrTTE" PSKSRf \SSE^ a ^T^ARQ! 

^ILIVNKK AGETLLQRAARU 'Y1X<- vXYCLQK 
DSEDVNHRDNAGYTALHEACS^ WTDELNILLE 
HGA 


3150 


A 


3 


2795 


SLRMHh^ILVRQIKFYYQETLQQL^/iSLPNVLI 

IGKOTFSEQGTEEVKK1J,IXIJXK^AV<^QKKEEF 

IERIQGLDFDTKAAVAAHIQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPIJLKNMALHLKRLIDERDEH 

SETIIELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHI^VELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 

KARVEEIJCEDNQVLLETKTMLEDQLEGTRARSD 

KLHEl^KENLQLKAKLHDMEMERDMDRJCKIEE 

LMEENNTILEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEEIJITTVDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

IETLRENSERQDCILEQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKENKILHESIKETSSKLSKIEFEKRQI 

WT?T •pi_r\/T/T?.T/ r> r? I > A I ' 1 »T ttvIPI TTTJT T7TT T7 VTT7T T f\V 

KK£LEHYK£KGERAEELENELHHI .KKKNEIXQK 
KITNLKITCEK1EA1JEQENSEIJERENRKIJCKTLDS 
FKNLTFQI^LEKENSQLDEENLEIJIRNVESLKC 
ASMKMAQLQLENKELESEKEQLKKGLELLKASF 
KKTERUEVSYQGIJ>IENQRLQK1LENSNKKIQQL

ESELQDl^MENQTLQKNLEELKISSKRLEQLEKE 

NKSLEQETSQLEKDKKQLEKENKRLRQQAEIKD 

TTLEENNVKIGNL£KENKTLSKHGIYKESCVRLB 

EIJEKENKELVKRATTOIKTLVTLREDLVSEKLKT 

QQMNNDLEKLTHELEKIGLNKERLLHDEQSTDD 

SRYKIXESKI^TLKKSLEIKEEKIAALEARLEES 

TbTYNQQLRQEIiCTVKKK 


3151 


A


2 


2515 


GFWLHLTLLGASLPAALGWMDPGTSRGPDVGV 

GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 

SRGSRCVLSRKTGEPECQCLEACRPSYWVCGSD 

GRFYENHCKLHRAACLLGKRITVIHSKDCFLKGD 

TCTMAGYARLKNVLLALQTRLQPLQEGDSRQDP 

ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 

KKQDIJDEDIXGCSPGDLUUT>DYNSDSSLTLREF 

YMAFQWQLSI^EDRVSVTTVTVGLSTVLTCA 

VHGDLRPPIIWKRNGLTLNFLDLEDINDFGEDDS 

LYITKVTTIHMGNYTCHASGHEQLFQTHVLQVN 

VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 

WLKNGVDVSTQMSKQLSLLANGSELfflSSVRYE 

DTGAYTCIAKNEVGVDED1SSLFIEDSARKTLANI 

LWREEGl^VG>nVIFYWSDDGIIVIHPVDCEIQRH 

LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 

RNRYIYVAQPALSRVLWDIQAHKVLQSIGVDPL 

PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVTTE 

ASTGQSQHLIRTPFAGVDDFFIPPTNLIINHIRFGFI 

FNKSDPAVHKVDLETMMPLKTIGLIfflHGCVPQA 

MAHTHLGGYFFIQCRQDSPASAARQLLVDSVTD 

SVLGPNGDVTGTPHTSPDGRFTVSAAADSPWLHV 

QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 

YNTYAALHlEPDLIiTLELSTGKVGMLKNIJ^ 

G?AQr^ r GO'niRIMRr > SOLI 7 ^YLLTPARESLFiV 

NGRQN1 LRCEVSGIKGG TTV VWV j 7 


3152 


A 


1 


2645


GAGWQVSLTGRWSPGREAGAGEVKQDPGSTAA 

SPSSCDADLSARMARGERRRRAVPAEGVRTAER 

AARGGPGRRDGRGGGPRSTAGGVALAWVLSL 

AI/jMSGRWVLAWYRARRAVTLHSAPAVLPADS 

SSPAVAPDIJWGTYRPrWYFGMKTRSPKPLLTG 

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 

HDGI^FGRQfflQDGAIJlLriEFVKRPGGQHGGD 

WSWRVTVEPQDSGTSALPLVSLFFYWTDGKEV 

LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 

DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 

FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 

QFLI(^ VTLKIPISIEFVFESG SAQ AGGNQ ALPRL A 

GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 

QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 

QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 

QLWQRWDPSLTREALGHWLGLLNADGWIGRE 

Q1LGDEARARWPEFLVQRAVHANPFTLLLPVAH 

MLEVGDPDDLAFLRKALPRLHAWFSWLHQSQA 

GPIJ'LSYRWRGRDPALPTLLNPKTLPSGLDDYPR 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 

AEVAAELGPLAASLEAAESLDELHWAPELGVFA 

DFGNHTKAVQLKPRPPQGLVRVVGRPQPQLQYV 

DALGYVSLFPLLLRLLDPTSSRLGPLLDILADSRH 










LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALO ALHHY GHLEGPHQ ARAAKLHGE 
LRANWGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 



A 


1 


4312 


MVDCTDELPAAAPADSAREHGSQAGGKGRPGAA 

AVIXADLERDARQGECALPGAAMAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVTVaQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 

EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS 

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 

KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 

WKPLQCLKVLTN1SGWNPPPGNRKMHGDLN1YLF 

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 

RFI^HSLVELLNQISPTFKKNFAVLQKKRVQRHP 

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 

RLGYEEHIPGQTRDWNEELQTTRELPRKNLPERL 

LRERAIFXVHSDFTAAATRGAMAVE)GNVMAIN 

PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD 

VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 

VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK 

TWSHPRYLELLERTSRPLKILRHQVLNDRDEEV 

ELCSSVECKGIIGNDGRHYBLDLLRTFPPDLNFLP 

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 

FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN 

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 

ELAETIAADDGTDPRSREVIRNACKAVGSISSTAF 

DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 

AFT.LSCQIFG- ^a*^re?AVL*V^ V^AEVm 

Q,; ^L^WYLGKVLEL\nJlSPA: J»gi^;. -/FK1GIG 

EUTRSAKHIFKTYLQGVELSGLSA JSHFLNCFLS 

SYPNPVAHLPADELVSKKRNKRRKi'TR?PGAADN 

TAWAVMTPQELWKMCQEAKNYFDFDLKCETV 

IXJAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 

RHKPAl'^KI^VLNIFPVVKHVOTKASDAFHFFQS 

GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 

VETCACLRIXARUmMGDYAEALSNQQKAVL 

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 

LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 

HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 

SHHLVARVYESKAEFRSALQHEKEGYTIYKTQL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 

NGSSANIPPLKFTAPSMASVLEQLNVINGILFIPLS 

QKDIJENIJCAEVARRHQIXJEASRNRDRAEEPMA 

TEPAPAGAPGDLGSQPPAAKDPSPSVQG 


3154 


A 


416 


4082 


KFKLIKIMLLTLIILLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCV GPAPFLIFSHGNSIFRIDTEGT 

NYEQLVVDAGVSVTMDFHYNEKRIYWVDLERQ 

IXQRVFIJN[GSRQERVQ*nEKNVSGMAIKWINEEV 

IWSNQQEGIITVTDMKGmSHIlJL^ALKYPANVA 

VDPVERFBFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YDGGSVHISKHPTQHNLFAMSLFGDR1FYSTWK 






NIKT1 WIANKETTGKDMVRINLHS SF VPLGELKW 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENKIYFAHTALKWIERANMDGSQRERLIEEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKnTIENlSQPRGIAVHPMAKRIJWTDTGINPRIE 

SSSI^1X}RLVIASSDLIWPSGITIDFLTDKLYWC 

DAKQSVDEMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVmVNKRTGKDRVRLQG 

SMLKPSSLWVHPLAKPGADPCLYQNGGCEHIC 

KKRIXjTAWCSQIEGFMKASIXjKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSEDNITESQHM 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 

CMYTEALDKYACNCVVGYIGERCQYRDLKWWE 

UUIAGHGQQQKVIWAVCVWLVMLLLLSLWG 

AHYYRTQKLLSKNPKNPYEESSRDVRSRRPADT 

EDGMSSCPQPWFVVIKEHQDLKNGGQPVAGED 

GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV 

SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 

SLLSANPLWQQRALDPPHQMELTQ 


3155 


A 


533 


212 


GTSGWYWERLAERRGRLWSREEAMATMENKVI 
GALWWT -ALGTv* F A ^YTPTCTVAFr FRQW?0 
FPGVY73Q NXGCCFDLi vKGVPWCFYPNl £D 
VPPEEBCEF 


3156 


A 


2 


1585 


PRVRAAD VAAG AQAVVS AGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 

NGFAERRIDKFGFIVGSQGAEGALEEVPLEVLRQ 

RESKWLDMLNNWDKWMAKKHKKIRLRCQKGI 

PPSIJIGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRII^WSSVIJRVWDMFFCEGVKIIFRVGLVLLK 

HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 

LVQEVVELPVTERQIEREHLLQLRRWQETRGELQ 

CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 

APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEKP 

PAPNQAMWAAAGDACPPQHVPPKDSAPKDSAP 

QDLAPQVSAHHRSQESLTSQESEDTYL 


3157 


A 


3 


601 


SSAMGSRSSHAAVIPDGDSIRRETGFSQASLLRLH 

HRFRAUjRNKKGYI^RMDLQQIGAIAVOT 

nESFFPDGSQRVDFTOFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRRNKUIYAFQLYDLDRDGKISR 

HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 

EDGIX5AVSFVEFTKSLEKMDVEHKMSIRILK 


3158 


A 


2 


409 


ISSCPHTAYEGSMSTI^NFTQTLEDWRIUFrrYM 
DNWRQNTTAiEQEALQAKVDAE>nnnrVILYLMV 
MGMFSFHVAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 




3159


A


3 


410 


PWGAAI^DMGRIUjAQLIAALLVL 
KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 
DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 
I rOlorecUAoKRuUr oNr lr c VP WCr r Piva VEDC 
HY 


3160 


A 


179 


409 


KPKTIOIJCMVYYPI^ 

PKGM^PWKEVNRKKITOETNAASLTPLGSSELRS 
PRISYLHFF 


3161 


A 


683 


1186 


l^STGGIJiAAACAAAMSLVIPEKFQHIlJlVLNTN 

IIXjIUIKIAFAITAIKGVGRRYAHV^ 

KRAGELTl^EVERVITIMQNPRQYKIPDWFLNRQ 

KD\OGDGKYSQVLANGLDNKIJlEDl^RIJCKmA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162 


A 



1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRL1XL1XPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKj^QSAVSTMQQFYGIPVTGVLDQT 

T1EWMKXPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKII]nrYSIHNYTPKVGELDTRKAlE.QAFDV^ 

QKVTPLTraEWYHEIKSDRKEADIlVOTFASGFHG 

DSSPFDGEGGFLAl^YPPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPl^QYMETHNfXI^QDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIffiPSERKHERQPRPPRPPLG 

DRPSTPGTKmCIXjOTlsrrVALr^ 

YERAOGRF f„ ;-k3DKYW'VFKEVTVEPG\ ii 1 \hx.Z 

ELGSCU>REGI1)TALRWEPVGKTY1TKGERY^R 

YSEERRATOPGYPKPriVWKGIPQAPQGAnSKiZ 

GYYTYFxTCGRDYWKFDNQKLSVl^GYPRNlLRD 

WMGCNQKEVERRKERRLl^DDVDIIvmT^ 

GSVNAVAVVIPCII^LCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 


3163 


A 


1235 


2223 


SRI^LQFYVSFRRTGLrTCK^ 

RTNWVRFQPIHTACACIYIAARALQIPI^^ 

FLLFGTTffiEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKIJCAKGLNPDGTPAl^TLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

bKor YNuv KJCDSKRSKjN oRSASKoRSRTkSRSRS 

ITlTRRITA r NNRRSRSGTY 

rfflNHGSPI^KAKHI^ 

RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 
ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3 


3274 


IX:RLQAAMrTNlTVWVEAHADC^ 
EAPGTPEGPEPERPSPGlXjl^RENSPFLNNVEVE 

TNI^QGVVEHEl^EESRRREAKAPRM 
IJ^QNIIXjVILlTJtLTWWGVAG 
CTCTMLTAISMSAIATNGVVPAGGSYYMISRSLG 
PEFGGAVGLCFYLGTTFAGAMYTLGTIEIFLTYISP 










GAAIFQAEAAGGEAAAMLHNMRWGTC1LVLM 

ALVVFVGVKYVNKIJUAfFLAC^^ 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGEPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDIJ[DAQKSIPTGTIIAIVTTSFIYLS 

CIVLFGACIEGWLRDKFGEALQGNLVIGMLAW 

PSPWYIVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCIALMHCSWYYA 

I^AMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYAjJLRVEHGPPHTKhTWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTTVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGIJCHKIVLMAWPASWKQED 

OTFSWKNFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHIDVWWrVHIXjGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HIiUSAEVEVVEMVENDISAFTYERTLMMEQRS 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKDLFSMKPD QSNVRRMHTA VKLNG VVLNK 

SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 

GLNRVLLVRGGGREV1TTYS 


3165 


A 


3 


2681 


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

Df **TSPTPGDF XGLV ^RDNTDT JYPHP1 ; ' "\EK 

ARiWLAVETVPGELVGEQA J^APGffi'NSlNF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWEIXDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQELGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IIKLSEGEGNGPPPTVAPSSPSWPVARDQLELDR 

IJCDNLQGYKTQNKFLNKEI1^LSAIJ<RNPERRER 

DIMARNSSLEAKLCQEESKYLILLQEMkTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWENYFASTVNREMMCSPEL 

KNLIRAGIPHEHRSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKAI^KQNPASKQIELDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWRNPDIGYCQGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 

DYTLITFNWFLVVFVDSWSDILFXIWDSFLYEGP 

KVIFRFALALFKYKEEEELKLQDSMSIFKYLRYFT 

RTDLDARSGTDAPTTWRKSGWS 


3166 


A 


10 


4070 


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR 










TRAFADIXVERQTGQQDSDF^SPVTIIX}E£MVN 

GQRGLVLYYSLAAGYLYSWIXAPGAGIVKPHEH 

YUjENTVENSSDFQASSSVTLPTATGSALEQHIAS 

VREALGVESHYSRACASSETESEAGDIMDQQFEE 

MNNKLNSVTOPTGrlJfcMVRRN^ 

LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL 

IAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 

ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL 

RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG 

PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 

ALTQAECVHFATHISWKLSALVLTPSMDGNPASS 

KSSFGHPYTTPESLRVQDDASDGESISDCPPLQEL 

LLTAADVLDLQLPVKLWLGSSQESNSKVAADG 

\0ALTRAF1AAGAQCVLVSLWPVPVAAFKMFIH 

AFYSSLLNGLKASAALGEAMKWQSSKAFSHPS 

KWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 

ARDALRVLLHLVEKSLQRIQNGQRNAMYTSQQS 

VENKVGGIPGWQALLTAVGFRLDPPTSGLPAAV 

FFPTSDPGDRLQQCSSILQSLLGLPNPALQALCK 

LrTASETGEQLISRAVKNMVGMLHQVLVQLQAG 

EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 

VLCEVGQEEVILKTGKQANRRTVHFALQSLLSLF 

DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 

QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 

GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 

PQTRPAGNKDEEEYEGFSIISNEPLATYQENRNTC 

FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 

MTLEPSPNSPFQKVGKLASSDTGESDQSSTETDST 

VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 

RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 

RPVI SHQKPO r£?T> YTVKPiTP J>KVSS' -vs- 

SPTTSEMSX j6o?D ^isGRPSPGCDSQTSQLDQPL 

FKUCYPSSP\'SAmSKSPRNMSPSSGHQSPAGSAP 

SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV 

QAVHNLKMFWQS IT DHSTGPMKIFRGAPGTN1TS 

KRDVLSLLl^PRimKEEGVDKLELKELSLQQH 

DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 

ARPLRIJSGNGYKFLSPGRFFPSSKC 


3167 


A 


1 


762 



AARRRQKGKEENMMMDIJFETGSYFFYLIXjENV 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 

SAPTDRRKAATLRERIUUJCKINEAFEAIJ^ 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEWEK 


3168 


A 


701 


246 


TSRRVTMKFNPFVT[^DRSK>niKRHFNAPSHV 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKVVQVYRKKYVIYIERVQREKANGT 

TVHVGIHPSKWITRIJC^ 

VGKEKGKYKEELIEKMQE 


3169 


A 


156 


3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGVVWGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTHHQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLIJREPVRAPAV 










AVAFIPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSVVNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLIEI 

LSEKAGIIQDTWHKATQKGDPVAILKRQLEBKEK 

IJLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK 

EV^LQGKIRTLQEQLENGPKTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRKALEAKAAAFEKQVLQLQ 

ASHRESEEALQKJUJDEVSREIXJHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERJRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

EIJIEAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQIX3JEAQTMEALLAL 

LPELSVI^QQNYTEWLQDOCEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKIJCGEIJESSIXJVKEHTSHLE^ 

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 

AGAPASSPEAPPAEQDPVQLKTQLEWTEADLEDE 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 

ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 

LIXTTQEQlJ^EKDTVKKLQEQLEKAEDGSSSK 

EGTSV 


3170 


A 


6730 


4027 


THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTOSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGYL?DOPSSIJBDIi£ LPVTVFiXIA SEEQLfi < 

Kl^VQGAEI^EAGNGKRA^/HEBIR?^/.' rk^RNK 

ADKG VSLSKDPSCQTQISDSPADASPP1 GLPDAE 

DSEVSSQKPIEEKAVTPSPEQVFAECSQKRILGLL 

AAJVILPPLKSGPTVPIIDLEHVIJ*^ 

HLNETYrD^TLGLLGQLIIRIXPAEVDAAVIKVLSA 

KHNIJAAGDSSIVPDGWKTT^^ 

RVGLDWACSMAEnJ^LNSAPLWRD\OATFTDH 

CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCY 

YMDNANEPHNVHLKHFTEKNRAVIV^ 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTUCAHGFEEIRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDIJEILSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRICFLMAHDALNAPIJnLRAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMELPCLSRPARCDQATAESNPVT 

QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 

SKRAVRDYLFRVNEATAVLYARHVLASLLAEWP 

SHWVSEDILEl^GPAHMTYIIJDMrMQLEEKHE 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 

TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 

PERDFQLNQKALSPSSQFPSAEILRHIR 


3171 


A 


557 


89 


GTRAGPVKDREAFQRLNFLYQAAHC VL AQDPEN 










Q ALARFY CYTERTIAKRL VLRRDPS VKRILCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRFLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ 


3172 


A 


2 


496 


FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYPINIVALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVAIASIDGLLnGISCAVHFTRNA 


3173 



A 


2 


4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

IXHIX3LGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNIJPISPKEHK1JCDDSIVDVQNTESKKLSPPVVE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

K1JDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APIEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTS V A SPKDPEDIPTFDE WKKKVMEVEK 

EKSQSMHASSNGGSHATPaCVQKNRNNYASVEC 

GAKILAAKPEAKSTSAILEENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAAN1LGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQIFCSELTnCCIS 

SFSEV KKW^SV^VAJ ^f™*TALS&'}l'S-Y* V 

LAQPi^LLLPAES VDVS v 2 . vr L- 7£LENl"NIEki3AE 

TVVLGDI^SSMHQDDL\>"i4IYDAVELEPSHSQT 

LSQSLLLDITPEINPLPKIE\ VEYEAGHIPSPVI 

PQESSAnEmNETEQKSESrasn^:Fa.;TYETNK^ 

LMDNIIKEDWSMQIFTKLSETrVPPINTATVPDN 

EDGEAKMNIADTAKQTLISWDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQKESVFMRLNNRKALEVNMSLSGRYL 

EE1JSQRYRKQMEEMQKAFNKTIVKLQNTSRIAE 

EQIXJRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

UGlEVSDRQSYL\nSL\aX^/LGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDFNDLYIVEPLKFSP 

EKKKKRCKYKIEKIETIKPEEPLHPIANGDIKGRK 

PFTNQRJDFSNMGEVYHSSYKGPPSEGSSBTSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLIKTLIQTKSGSLPSLHDIIKGNKEITV 

GTFGVTAVSGHI 


3174


A 


485 


4668 


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 

RMGAIXjETWLKhMLIGVNLDJLGSMIKPSECQL 

EVTTERVQRQS VEEEGGIANYNTSSKEQP VVFNH

VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 

MGQTSDHESQVTFTHRJNFPKKACPCASSAQVLQ 

ELLSRJEMLEREVSVLRDQCNANCCQESAATGQL 






DYDPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTTTELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFIXjWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYIVNVVALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFILLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKVVYITLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISL1WTKASGPID 

HYRITFTPS SGIASEVTVPKDRTSYTLTDLEPG AE 

YnSVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVMTWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVIJV1GLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDIT1SNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSWPNTVTEFTITR 

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATNITPTEA1XQWKAPVGEVENYVIVLTOT 

AVAGEmVDGVSEEFRLWLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKIXJWCDMTTDGGGWTVFQRRQNGQTDFFRK 

WAPYFA'GFONVEDEF'&XGI ^NIHRTTSQGRltT- 

RVl)MRLxiQHAAFASYDiiFS^/ED3:.^ ^VKLRlucy 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 

MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 

WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 

SLQF 


3175 


A 


2 


623 


RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 

AATAEGTMASGVTVNDE VIKVFNDMKVRKS ST 

QEEIKKRKKAVLFCLSDDKRQETVEEAKQILVGDI 

GDTVEDPYTSFVKLLPLNDCRYALYDATYETKE 

SKKEDLWIFWAPESAPIJKSKMIYASSKDAIKKK 

FTOIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 

LEGKPL 


3176 


A 


99 


1567 


PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

ADEGSIFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

IAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRHRDRPTTGKTTJCSGLCSALTTYFFGADLKGK 

LTIKNFLEFQRKLQHDVLKLEFERHDPVDGR1TE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVENrTTr^KNINDVDTALSFYHMAGAS 










LDKVTMQQVARTVAKVELSDHVCDVVFALFDC 
DGNGEI^GNKEFVSIMKQRLMRGLEKPKDMGFrR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 


182 


648 


LGWGSGAAVGGRQAARGAALGRRPMAAVLG 
ALGATRRLLAALRGQSLGLAAMSSGTrlRLTAEE 
RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA 

FGFMSRVAIXJAEKMNHHPEWFNVYNKVQITLTS
HIX:GELTKKDVKLAKFIEKAAASV


3178 


A 


8 


612 


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF

IJIRPHIXHTPRAPTE 7 RIRLGAHRGGSGELLENTM

EAMENSMAQRSDLLELDCQLTRDRWWSHDE

NLX:RQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN

EELIREIAGLVRRYDROTITIWASEKSSVMKKCK


3179 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHnO^aLTGADGKN 

LIKNDLYPOTKPEVIJIMIYMRALQ 

YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 

CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 

RETYMEFLWQYKSSADKMQQLNAAHQEALMK 

LERLDSVPVEEQEEFKQLSDGIQELQQSLNQDFH 

QKTIVLQEGNSQKKSNISEKTKRLNELKLSVVSL 

KEIQESIJCTKIVDSPEKIJCNYKEKMKDTVQKLK 

NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 

IQDLSDNREKLASILKESLNLEDQIESDESELKKL 

KTEENSFKRLMIVKKEKl^TAQFKINKKHEDVK 

QYKRTVIEDCNKVQEKRGAVYERVTTn^HEIQKI 

RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD 

GIEKAAEDSYAKIDEKTAELKRKMFKMST 


3180 


A 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 
WPLFIFLILI5VRLSYPPYEQHECHFPNKAMPSAG 
TLPWVQGIL ~ ; 2I>!PCTRYPTT?7 WGN*1 ZL ' 

/A^SDARRUXYSQKDTiL ^IkLl^ JCVLRTI/ 
C^IKKSSSNLKLQDFLVDNEIFSt: "^YHNLSLPK 
STVDKMIJlADVILHKWLQGYQLilL^SLCNGSK 
SEEMIQIXJIXJEVSEI^GLPREKLAAABIIVLRSN 
MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 
LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 
YQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKAL 
FGGNGTEEDAETFYDNSTTPYOTOLMKNLESSPL 
SRIIWKAIJCPLLVGKILYTPDTPATRQVMAEVNK 
TFQELAVFHDI^GMWEELSPKIWTFMENSQEMD 
LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 
LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 
FMECVNLNKIJBPIATEVWLINKSMELLDERKFW 
AGIVFTGI1PGS1ELPHHVKYKIRMGIDNVERTNK 
IKDGYWDPGPRADPFEDMRYVWGGFAYLQDW 
EQAHRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 
VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 
ETMRIMGLDNSILWFSWHSSLIPLLVSAGLLVVI 
LKLGl^PYSDPSXn^VIO^WAWTILQCFLIST 
IJ^SRANIAAACGGIIYFTLYLPYVLCVAWQDYV 
GFTLKLFASLLSPVAFGFGCEYFALFEEQGIGVQW 
DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 
WYIEAWPGQYGIPRPWYWCTKSYWFGEESDEK 
SHPGSNQKRISEICMEEEPTHLKLGVSIQNLVKVY 




RDGMKVAVDGLALNFYEGQITSFLGHNGAGKTT 
TMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLG 
VCPQHhTVTLFDMLITEEHIWFY'ARIJCGLSEKHVK 
AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS 
VALAFVGGSKWILDEPTAGVDPYSRRGIWELLL 
KYR^RTDLSTHHMDEADVLGDRIAnSHGKLCC 
VGSSLFUCNQLGTGYYLTLVKKDVESSLSSCRNS 
SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTID 
VSAISNLIRKHVSBARLVEDIGHELTYVLPYEAA 
KEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFL 
KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 
RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS 
YQVKGWKLTWQFVA1XWKRLLIARRSRKGFF 
AQIVLPAVFVCIALVFSLrVPPFGKYPSLELQPWM 
YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT 
RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 
NGNWTMQNPSPACQCSSDK3KKMLPVCPPGAGG 
LPPPQRKQNTADILQDLTGKNISDYLVKTYVQIIA 
KSLKNKTWVNEFRYGGFSLGVSNTQALPPSQEV 
NDATKQMKKHIJCLAKDSSADRIT^NSLGRFMTG 
LDTRN>TVTCVWFh^GWHAISSFLNVr>WAlLRA 
NLQKGENPSHYGITAFNHPLNLTKQQLSEVAPM 
TTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKA 
KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 
imaCFQQKSYVSSTNLPVLALLLLLYGWSITPLM 
YPASFVFKIPSTA YWLTS VNLFIGING S VATFVL 
ELrTDNKLNNINDDJCSVFLI^ 
KNQAMADALERFGENRFVSPLSWDLVGRNLFA 
MAVEGVWFLITVLIQYRFFIRPRPWAKLSPLND 
EDEDVRRERQRILDGGGQNDIIXIKELTKIYRRK 
RrTPAVDRIi. VGT^GF^FGLLG 1 NG/ WSSTFKM , 
- L7CL' TVTKGDAi LKRNSILSNIHEVHQNMt : 
QFDATTELLTGREHVEFFALLRG VPEKEVGKV GE 
Vv^ffiKLGLVKYGEKYAGNYSGGNKRKLSTAMA 
LIGGFPVVFLDEPTTGMDPKARRFLWNCALSVV 
KEGRSVVLTSHSMEECEALCTRMAIMVNGRFRC 
UJSVQHLKNRFGDGYTIVVRIAGSNPDLKPVQDF 
FGLAFPGSWKEKHRNMLQYQLPSSLSSLARIFSI 
l^QSKKRlJHffiDYSVSQITLDQVFVNFAKDQSDD 
DHLKDLS1JHKNQTVVDVAVLTSFLQDEKVKESY 
V 


3181 


A 


215 


1367 


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR 

HWIJTEFPEL^SQNQNHUa^WFLENKSEW 

RNNEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN 

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

FPILQTNNDPGLFVYCCDFSSTA1ELVQTOSEYDP 

SRCFAFVHDLCDEEKSYPVPKGSLDIIILIFVLSA1 

WDKMQKAINRLSRLLKPGGMVLLRDYGRYDM 

AQIJIFKKGQCLSGNFYVRGDGTRVYFFTQEELD 

TLFTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 

WIQCKYCKPLLSSTS 


3182 


A 


3 


1289 


GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF 
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ 






WO 01/57190 



PCT/US01/04098













AEEENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEWMNSQQTPVGTPKDKRVSNTPLRTV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKWHAVDGTAENGIHP 

LSSSEVDELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE
PPVTMI™GYQNVEDEAETI<XVLGLQDTITAEL
W1ED AAEPKEPAPPNG SAAEPPTEAASREENQA
GPEATTSDPQDLDMKKHRCKCCSIM


3183 


A 


333 


1931 


IAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSVVSLTRWLPNLT 

DVWPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 


3184 


A 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFBLFLTRSRGRAASAGQEPLHNEELAGAG 

R 1 'AQPOI* EPEEP; A OGP^RF °RDJ CfF/.r * OR j 

RAQR V VEADfcl ^EAVEAQEE^V^ . ET 

HLSGKlGAKKl^RKLEEKQARKAQREAH^AEREE 

RKRL^QREAEWKXEEERLRLEEEQKEiilE^KA 

REEQAQREHEEYlJKLKEAFVVEEEGVGETlvi.rSB 

QSQSFLTErTNYIKQSKVVLLEDLASQVGLRTQD 

TINRIQDIXAEGTITGVIDDRGKFIYnPEELAAVA 

NFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA 


3185 


A 


2981 


7173 


CLLAGKFSSTLYETGGCDMSLVNFEPAARRASNI 

CDTDSHVSSSTSVRFYPHDVLSLJKJIRLNRLLTID 

TDLLEQQDIDLSPDLAATYGPTEEAAQKVKHYY 

RFWILPQLWIGINFDRLTLLALFDRNREIIJKNVLA 

VEJULVAFLGSI1XIQGFFRDIWWQFCLVIASCQ 

YSLLKSVQPDSSSPRHGHNRIIAYSRPVYFCICCG 

LIWLLDYGSRNLTATKFKLYGITFIWLVFISARD 

LVIVFILCFPrVFnGIXPQVNTFVMYLCEQLDIHI 

FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL 

KDSWDGQHIPVLFSIFCGLLVAVSYHLSRQSSDP 

SVLFSLVQSKIFPKTEEKNPEDPLSEVKDPLPEKL 

RNSVSERLQSDLWCIVIGVLYFAIHVSTVFTVLQ 

PALKYVLYTLVGFVGFVTHYVLPQVRKQLPWH 

CTSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 

LLFVEKNIIYPLIVLKELSSSAETIASPKKLNTELG 

ALMITVAGLKLLRSSFSSPTYQYVTVIFTVLFFKF 

DYEAFSETMLLDLFFMSILFNKLWELLYKLQFVY 













TYlAPWQrrWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAFFSTPLNPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDRNPGTYCQQREVEATTEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VIVTKYILEGYSITDNSAASMLQVFDLRKVLTTY 

YVKGIIYYWTSSKL^EWLANETMQEGLRLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCV1Y 

LNWIEYCSSRBAKPVD VDKDSSLVTLCY GLCVL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SERDEWIF ADMELLRKVWPG IRMSIKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQ1SLRNLPSSIQSRLSMVNQ 

MEr^GQSGIACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGNTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSHIDKAVLLVQIDDKYVTVIETGVLELGAEV 


3186 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

IXiVIKEVNVSPCr^QPCQLSKGQSYSVNWFTSN 

IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNQSLFCWEIPVQIVSHL 
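
The column headings and the one-letter legend of this table lend themselves to a simple consistency check on each entry. The following is a minimal sketch only, not part of the original disclosure. The names `AMINO_ACIDS`, `expected_residues` and `unrecognized_symbols` are hypothetical. The sketch assumes that both predicted nucleotide locations are inclusive, and that an entry whose beginning location exceeds its end location is simply read in the opposite direction; the table itself does not state either point.

```python
# One-letter codes as given in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def expected_residues(begin, end):
    """Return (residues, leftover_nucleotides) for a predicted span.

    Assumption: both locations are inclusive, and begin > end only
    changes the reading direction, not the length of the span.
    """
    span = abs(end - begin) + 1
    return divmod(span, 3)


def unrecognized_symbols(sequence):
    """List characters that are not in the legend (whitespace ignored)."""
    return sorted(
        {ch for ch in sequence if not ch.isspace() and ch not in AMINO_ACIDS}
    )


# SEQ ID NO: 3186 is listed with locations 3 and 470.
print(expected_residues(3, 470))                   # (156, 0)
print(unrecognized_symbols("KNQSLFCWEIPVQIVSHL"))  # []
```

A non-zero remainder, or a non-empty list of unrecognized symbols, would only flag an entry for manual review; it may reflect a marked insertion or deletion, or a transcription error, rather than a fault in the sequence itself.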


3187 


A 


3 


470 


sls amrflaatfi llalstaaq aepv qfkdcg s v 

dgytk; t"- /spcf tqic^ t c ^gqsys v . kvtf^n 

IQ&^-sm^y HGILMGVTVPFPIPEPDGCjCSGINC 
PIQKjl>\ TYSYimLPVKSEYPSIKLVVEWQLQDD 
KNQSLFC^TSIPVQIVSHL 


3188 


A 


2 


3483 


PRVRTKLILLVrlDKKRYERVGGGPKRLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

(^YRQTPYNNVQSRINTGRRKANENAGLQECPR 

KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQEEELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATIXJRSN1RDNVEMIKLHKQLVE 

KSNALSAMEGKHQLQEKQRTLKISHDALMANG 

DELKMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQIAQLETALKSDLTDKTEELDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RIKLYNQENDINADELSEALLLIKAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLIMQHKINKDYQMEVEAVTRJ^ 

YELKVEQYVffl-UDniAARIHKLEAQlJCDIAYGTK 

QYKFKPEIMPDDSVDEFDETIHLERGENLFEIHIN 

KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP

\^GIIIFEYNFreQYLVHV^ 

EVHQAYSTEYETIAACQLKFHEILEKSGRIFCTAS 

LIGTKGDIPOTGTVEYWFRLRVPMDQAIRLYRER 

AKALGYITSNFKGPEHMQSLSQQAPKTAQLSSTD

STDGNLNELHITIRCCNHLQSRASHLQPHPYWY 

KFTOFADHDTAIIPSSNDPQFDDHMYFPVPMNM 

DLDRYIJCSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRQSGIFELTDHQKHPAGT1HVILKWKFA 

YLPPSGSITTEDLGNPIRSEEPEVVQRLPPASSVST 

LV1APRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI 

SKN1KQPSEKIRIEIIALSLWSQVTMDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAK^I1JKAILQKQEMP>^LRFTW^ 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDL1E 

QNIDVFDARADGEGIGKIJR.VTVEALHALQSV^ 

QYRDDLEA 


3189 


A 


476 


1175 


MKGSGWHLRSGMVGTLIITILPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 

NDWQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCI^CQDEAPYRPYQAPPRATTTTANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 

GSFLJKELEKSKFI^SISIXENTLSKSLEEKLRGLS 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 

DPKEQVKRYGGFLRKYPKRSSBVAGEGDGDSM 

GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 

RQFKWTRSQEDPNAYSGELFDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID

FrrOrAGASST^'PMCCS/iLR^^FVVlJCCrj'rK 

IVEMSTSKTGKHGHAfcVHL\ ; ; ^ ; FTGKK YEDIC 

PSTHNMDVPNIKRNDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVAIKPCK 


3192 


A 


105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 

VVIGAGLAGIJVAAKALLEQGFTDVTVLEASSHIG 

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 

ANGIXEETTDGERSVGRISLYSKNGVACYLTNH 

GRRIPKDVVEEFSDLYNEVYNLTQEFFRHDKPVN 

AESQNSVGVFTREEVRNRIRNDPDDPEATKRLKL 

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP 

GAHHIIPSGFMRVVELLAEGIPAHVIQLGKPVRCI 

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 

EEPRGGRWDEDEQWSVWECEDCELIPADHVIV 

TVSLGVLKRQYTSrTRPGLPTEKVAAIHRLGIGTT 

DKIFIJEFEEPrWGPECNSLQFVWEDEAESHTLTY 

PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRI 

IJRSAWGSOTYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK


3193 


A 


1 


1928 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
AI^SVWKDSNSTTPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSBERGK 
WVFFQNCHLAPSWMPALERLIEHINPDKVHRDF

RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERROGPLGF>HPYEFTDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTODWDRRCI 

MNIIJBDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYI^YIKSLPLNDMPEIFGlJIDNANriTAQNETFA 

LLGTnQLQPKSSSAGSQGREEIVEDWQNIIXKVP 

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNR 

1JLQV1TQTLQDLLKALKGLVVMSSQLEL3V1AASL 

Y>HSrrVPELWSAKAYPSLKPLSSWVMDLLQRLDF 

LQAWIQDGIPAWWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIrIG 

LFLEGARWDPEAFQLAESQPKELYTEMAVIWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

>TW1AVEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1 


1023 


DGWTPVHAAVDTGWVDSLKIJLMYHRIPAH 

FNEEESESSVFDLDGGEESPEGISKPWPADLINH 

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE

RRDKCNRTVHDVATDDCKHLLENLNALKJPLRIS 

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNHFQAISSDGWWSLEDVTCNNTTDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 

CSLVA 


3195 


A 


1 



1809 


MAASAQVSVTFEDVAVTFTQBEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

FLnGIGPQ^CI^ T JBKM^EliPOT ^SDDGVCTO 

TQKQVSi ;V: j£iL 1 r- .'DSHGPVl"i>ALIREEKNSYiC 

CEECGKVFiCT 1NAIXVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHITIHTGEKPYKCMECGKAFNRR

SHLTRHQRfflSGm^P\XCSECGKAFraRSIPVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYIIHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCESADUQHYIIHTGEKPYKCMECG 

KAFNRRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKR1HTGBKPYECKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT 

GEKPYECIQCGKAFCRSANLIRHSIIHTGEKPYEC 

SEXXiKAI^GSSLTHHQRIHTGRNPTIVTOVGRP 

FMTAQTSVNIQELLIXjKEFLNITTEENLW 


3196 


A 


1400 


264 


VGrlVERPl^SSRWFRRSlJUlWEMIj^RAARGTG 

ALLLRGSlXASGRAPRRASSGLPR>rrVVLFVPQQ 

EAWVVERMGRFHRIIJEPGLNILIPVLDRIRYVQSL 

KEIVINVPEQSAVTI£>NVTLQ 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FRERESl^ASIVDAINQAADCWGIRCLRYEIKDIH 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQILASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTILXPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE

ELDRVKMS 


3197 


A 


66 


3632 


LWECAAAAAGQRIXjGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD 

DTEDFKEPLKVVFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL 

RNFNMMAPLHIAV(^MN^VMBwVLLEHRTIDV 

NLEGENGNTA VnACTTNNSEALQlLLNKG AKPC 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY 

SRQOBNFMNNGKATPLrlLAVQNGDLEMIKMCL 

DNGAQ1DPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIVNTTD^HETMLHRASLFDHHELAD 

YLISVGADINKTOSEGRSPLILATASASWNIVNLL 

l^KGAQVDIKDNFGRNFLHLWQQPYGUCNLRP 

EFMQMQQIKELVMDEDNDGCrPLHYACRQGGP 

GSVNNLLGFKVSIHSKSKDKKSPLHFAASYGRIN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKWQLLLKKGALFLSDHNGWTALHHASMGG 

YIXJTMKVILDT^KCTDRLDEDGNTALHFAARE 

GHAKAVALULSHNADIVLNKQQASFLHLALHNK 

RKEVVLTURSKRWDECUOFSHNSPGNKCPnTM 

ffiYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWI^YGFRAHMMNLGS 

YCLGLIPMTILVVNIKPGMAFNSTGIINETSDHSEI 

LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWnYTTGIIFVLPLFVEIPAHLQ 

WQCGAIAVm^YWMNFLLYLQRFENCGIFIVMLE 

VIUCTLLRSTVWIFLLLAFGLSFYILLNLQDPFSS 

PLLSnQTT^^r.GDINYRESFLE' \'\ RNFIA* W 

LSFAQLVSFTIFVPR 'l , -rXUGLAVGDIAEVQKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKST1 

VYPNKPRSGGMLFfflFCFLFCTGEIRQEIPNADKS 

IJEMEHJKQKYRLKDLT^ 

SETEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198 


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQIPGGTTVLVELTPDMCGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

WATQTQTTTRTITSETQTIWSAPEFVFEHGYQT 

YLPTESNENQTATVISLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDK1JCTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPFKCEFCNVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRIHERraCTVRPFKCNYCS 

FDSKQPSM.SKHMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKUVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD

GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 
FEG04ALIQEGTAEVTVVSDGGQN1AVATTAPPV 
FSSSSQQELPKQTYSnQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSDLQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RFNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRRLLRRLVGALVAEAGFCYVQVAEGQRWGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL 

AWLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERENNYGCGWGGEMTTLVLDNGAYNAKIGY 

SHENVSVIPNCQFRSKTARLKTFTANQIDEIKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 

QVDFLDTNHITEPYFNFTSIQESMNEILFEEYQFQ 

AVLRVNAGALSAHRYFRDNPSELCCHVDSGYSF 

THIVPYCRSKKKKEAinUNVGGKLLTNHLKEn 

RQ1JIVMDETHVINQVKEDVCYVSQDFYRDMDI 

AKJLKGEENTVMIDYVLPDFSTIKKGFCKPREEMV 

LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 

MGIPEAIVYSIQNLPEEMQPHFFKNIVLTGGNSLF 

PGFRDRVYSEVRCLTPTDYDVSWLPENPITYAW 

EGGKLISENDDFEDMWTREDYEENGHSVCEEK 

FDI 


3200 


A 


3 


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR 

SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 

MKVWYAIHRNWHLQCTGLWTLNLCQLCIFN 


3201 



A 


1 


469 


IRHEGRGQRGKMELVQVUCRGLQQITGHGGLRG 

YiJlVi*FP'n^AKVGTLVGF^KYG>O f !rYYEDNIvV 

FFGRHRWVVYTTEMNGK^iTWL>^ 

WHRWLHSMTDDPPTTKPLTARKFIWTOHKF 

GTPEQYVPYSTTRKKIQEW1PPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 

PQWRVSAFIENNIVVFENFWEGLWMNCVRQANI 

RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 

AFMMAILGMKCTOCHXJDN^ 

TGMVV1JPVSWVANAIIRDFYNSIVNVAQKRELG 

EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR

YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
WEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 
IJDYQGRQEIFLQRHGPLSVHMACI^FFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204 


A 


1808 


668 


PESAPLPAFISSRILPAAWRNWCSYWTRTISCHV 

QNGTYLQRXOXJNCPWPMSCPGSSYRTVVRFTYK 

VMYKIVTAREWRCCPGHSRVSCEEVAGSSASLE 

PMWSGSTMRRMALRPTAFSGCLNCSKVSELTER 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG

DPGEKSHWGEGLHQLREALKILAERVLILETN4IG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVF^SQEHLERFY(^Ehn3RMRM 

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSIYHTVTVKFFCRDHTC 

ANSRGLKEVRFMMWNNHmHNSFFRREIK^ 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

yiqiynlfhktvpefkyrilqilrvqnqfLweky 

KRKKEYMhniKMFGRDRIINERHIJ^GTSQDVVD 
GICKHNFDPRVCGKHATMFGQGSYFAKKASYSH 
NFSKKSSKGVHFMFlJyCVLTGRYTMGSHGMRR 
PPP\WGSVTSDLYDSC^ODNFFEPQIFVTF^DQS 
YPYFVIQYEEVSNTVSI 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEENLMSIIIKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSULDLKYVUSKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHIEMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLffiAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETOYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLIEHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKRFSSEITHKDVVQQNNTLIEEMLYL

IIMLVGERFSPGVGQVNATDEIKREI1HQLSIKPM 

AHSELVK5LPEDENKETGMESVIEAVAKrTCKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

/ CWIK)^ ' ONRE^AXTFVLFA TCH F**T,VNILQ 

SDVMLCIMG11., /vvAVi^INGYAWSESMLQRVL 

HLIGMALQEEKQH; .ENVTEEHVVTFTFTQK1SKP 

GEAPKNSPSBLAMLiiTL ONAPYLEVHKDMIRWIL 

KTFNA VKKMRESSPTSI' V AETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELF(^TIJEIJDASTSAV1JDHSPVASDMTLTALGP 

A(yiXJVPEQRQFVTCnXQEEQEVK\^SRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRlimYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFhWRIJ^SrXJPNLTQWIRTISQQIKALQFLRKE 

ESTPNNASIKNSENVDELQLPEGFRPDFRPKIPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSWQGHFCKPFASLVPND 

SHEELPCILDIDMFHLLVGLVLAFPALQCQDFSGI 

SIXJTGDLfflFHLVTMAHnQILLTSCEEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEEPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLX^YLSLPNNUCLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAFILCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL

RRGNPLHLCKERFKKIQKLWHQHSVTBEIGHAQ 
EANQTLVG1DWQHL 


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGPVYIGELPQDFUOtPTQQQRQVQLD 

AQAAC^LQYGGAVGTVGRLNITVVQAKLAKNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

HKVmCTVPPGVDSFYLEIFDERAFSMDDRIAWT 

H1T1PESLRQGKVEDKW Y SLbuKi^UDKxSUMlNL 

VMSYALLPAAMVMPPQPVVLMPTVYQQGVGY 

VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 

DUCAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN 

SLLQMGEEP 


3208 


A 


54 


1196 


LERTPASADMAWTKYQLFIAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

VHFTGLFSVAFLGRRLVLSQWLGILATIAGLVVV 

GLADLXSKHDSQHKLSBVTTGDLUIMAQnVAIQ 

MVLEEKFVYKHhTVHPLRAVGTEGLFGFVILSLLL 

VPMYYIPAGSFSGNPRGII^EDAIJJAFCQVGQQP 

LUVALLGlsnSSIAIWFAGISVTKELSATTRMVL 

DSLRTVVIWALSLALGWEAFHALQILGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 


3209 



A 



104 


1999 


AKWSLKEFSCFWRREKPVSSLSSLQVKAEASW 
DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 
DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 
IPGCGVTFEIVSNIPEDAQGVEEREALARMAANV 
ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE 
VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 
,y 5J>C3NKGRWESQOPV5iQTTrVSRGIAPA?AL 
SFQNNKePDPGI^NI^m^ 
KSLFPVRDEKJlGKRPSPLAHQPVPRxivfVQSASPDI 
RVTRMEBAQPEMGPDVLVQTMGAPALKICDKP 
AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 
YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 
EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 
EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

f-\ a T?/^\T ADA T>/~1 A ATA PT\0 T3T7 A nC\ /TJlOYl 7T T>r?/~« 

vj AECj N Ar Ar U Auuv^ Al^SDb iiliAi>ii Vr ii W LRxSu 
EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 
LPSGKNIXjSIGGKQYFRCNPGYGLLVRPSRVRR 
ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

Li 1 PU\\sfSJSJ\lJs\S5 xUSoN rMifN xsJVo WAa 


3210 


A 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALVVS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSVTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD 




A 


1078 




VuMliLr A V NLK V IxAAja. wLL T I vrULl Vr SOS Y A 
WANFTILALGVWAVAQRDSIDAISMFLGGLLATI 
FLDIVHISIFYPRVSLTDTGRFGVGMAILSLLLKPL 
SrrFVYHVfYRFRrrGPT T VHTfiFT rr^snryRSAYo 

ouur v i xxlva i jvDiv\j vj r> i / 1 < v xx x vji/ x^vjoov^xsivo/i. x 

1TDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
AIJCFMMEFRSWCPGW^^1V1ARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 



AFQNSSEREDCNNGEPPRKQPEKNSLRQTYNSCA 

RUXNQETVCLASTAMKTENCVAKTKLANGTSS 

MIWKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHUSQMCHYQHGHINSYLKPMLQRDFTTAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSIXjMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVETGSSDSTVRVWDVNTGEMLNTUHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRWRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTX3ALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENTLSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKnQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVDTGSSDSTVRVWDVKIXjEMLNTLIHHCEA 

VT .HLRFN1 '7 "* -£ *YTCS KDRST A v * ^ 1ASP ; 0 ITL 

• ; i\ : LVGHR \AVN VVDFDDr , . tVC^ " DR'iiKV 

WNTSTCEFVRTLNGHKRGIACI/ 'YRDRLWSGS 

SDNTIRL WDIECGACLRVLEGHL27 ,* 'RCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAF/iG iXCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

AIJCFMMEFRSWCPGWNTM^RSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RIXLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKI^ASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL

PARGIJ)HIAENILSYLDAKSLCAAELVCKEWYR 

VTSIXjMLWKKLIERMVRTDSLWRGLAERRGWG 

QYIJKNKJPDGNAPPNSFYRALYPKnQDffiTIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VJLJlLKrf^JNOMiVlV 1 CorwIJxvoJL/V V WUMAor IVilL 

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 
WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 
SDNTERLWDIECGACLRVLEGHEELVRCIRFDNK 
RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT
LVEHSGRVFRLQFDEFQIVSSSHDDTILrWDFLND 
PAAQSEPPRSPSRTYTYISR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRWTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVTTNARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEIKRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEBPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEH1DHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGIEVIDE 

GWWRGYGPDGHFGMFPANYVELBE 


3216 


A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTRVRWDNSALGNSPYHRAPRCIHVYKKN

GVGKVGDQIIXAIKGQKKXALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 


3217 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

V\* AMDG\TSLELGLPRKQ: ~" iOMKAOVTCFVC 

MNVVQKLDK- > .\i3NSSELMITrL\LERVCSV S. f j 

ASnKECIELVDTYSPSLVQLVAKJTPEKVCKFIRI. | 

CGN^RRARAVHDAYAIVPSPEWDAENQGSFCNij 

CKRLLTVSSHNI£SKSTKRDBLVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVAVNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLT1ADLNIQEQCESLGPG1J\VLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECELVDTYSPSLVQLVAKITPEKVCKFTRL 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRIXTVSSHNLESKSTKRDILV 

YMQOmFVTQYEPVIJESIJKDMN© 

GACMGPRTPLlXjTIXJCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3219 


A 


1623 


572 


TSAEGWKGCTCTFKDRSKLREHLRSHTQEKWA

CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKRFATER1XRDHMRNHVNHYKCPLCDMTCPL 

PSSLRMIMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 


A 


2760 


745 


SLGJPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSTPLEENVVDREQIDFLAARQQFLSLE 

QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 

HLANGHVVPIKPQVKGVVREE^VRAVF 

VQWDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGIJIQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASTPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGWRWEYFRLR 

PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSP1 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVIJSAIRVTRHKNAMAERWESRIYASE 

FOT> 


3221 


A


15 


478 


^^rvTFFFTTTPAFKMSKRGRGGSSGAKi'A^ • OLP 
VGAVINCADNTGAKNLYEtSVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
Y1<BKJXjVFLYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPIJIPANRSPATMAEI^EVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEEQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTnGVGVGAGAYII^\RYALNHPDTVEGLVLINI 

DPNAKGWMDWAAHKLTGLTSSEPEMILGHLFSQ 

EELSGNSELIQKYRNIITHAPN1JDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLWGDQAPHEDAW 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGY1JDDCTCDVETIDRFNN 

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTEKRPOvIPIASGQG

TSEENTFYSWLEGLCVEKRAFYRLISG1JL\SINV 

HI^ARYLIXJETWLEKKWGHNIIEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHK1JCEDFRLHFRNISRIMDCVGCFKC 

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQEIVSU^AFGmSYKCERIRKTSRNLLQ 

NIH 


3224 


A 


2 


803 


PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLP 

SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 

LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 

TIGIDFLSKTMYLEDRTVTILQLWDTAGQERFRSL 

IPSYIRDSTVAWVYDITNLNSFQQTSKWEDDVRT 

ERGSDVIIMLVGNKTDLADKRQITIEEGEQRAKE 

LSVMFIKI^AKTGYNVKQLFRRVASALPGMENV 

QEKSKEGMTOIKLDKPQEPPASEGGCSC 


3225 


A 


3 


5054 


PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKRVA 

\WGQPPSAARYMPREVPPRFRCQQDHKVIJJKR 

GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 

GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 

SSNNGTSPNPIHIWDKVIVDGSDMEEWPCIASKD 

TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 

GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 

TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 

PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 

QTSREQQSKMENAGVNFWSGREQAQHNTDGP 

KNGKINSU^SSPNPMENKGMPFGMGLGNTSRS 

TDAPSQSTGDRKTGSVGSWGAARGPSGTOTVSG 

QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 

SWDNN>mSTGGSWNFGPQDSNDNKWGEGNKM 

TSGVSOGEWKQ - - 3DHLK T GEW°^^OPNSST 

GAT- ^N s JKGKPUJBNQGNAQAK.-"^Ii5^: TGS 

EVEGQSTGSNHKAGSSDSHNSGRRJV. 

QAVLQTU^RTDLDPRVLSKTGWGQTOirQDTV 

WDIEEVPRPEGKSDKGTEGWESAATQmr &GG 

WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 

WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 

QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 

WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 

SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 

QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 

WSSGPQPATPKDEBPSGWEEPSPQSISRKMDIDD 

GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 

NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 

PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSIXGGWNDSAVMNPLAKQFSNMGLL 

SQTEDNPSSKMDLSVGSI^DKKFDVDKRAMNLG 

DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 

GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 

PQnSPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 

/""\T A jrr C/"YT DAIDArAT A /~*/*YI T T AAAAAAAT T S\\T 

QLAMLSQLFQIPQr QlJ\L(jlJJLQQQQx<^ ij <^N 

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 

QPGMKHSPSHPVGPKPHLDNMVPNALNVGLPDL 

QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 

FKQWTSMMEGLPSVATQEANMHKNGAIVAPGK

TRGGSPYNQFDHPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLIJIDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHI^NKMWKNHISSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTELAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 


VPWKRQDEQLSUJVETLYLDSPAVIHIXSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG

AHAPLIALCHVDGRVPFRPSSAVLLTELTKLLLC 

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHTTPLGLLLLILYCLI 

SGl^SVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALWLSQALNGL 

LMSAVMKHGSSITRLFWSCSLWNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT 

WYAMAKKDPEGLFLQDNTVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTOTEDPAKFK 

MKYWGVASFLQKGNDDHWTVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIV 

RQRQPFLCT \RQYRUVHNG u< 


3228


A 




H04 


QQl&PAAUAARI*^ ^ ~j 

IJCCVWGDGAVGKT(XIJyISYANDAFPBEYVPT ; 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFLICFSWNPASYHNVQEEWVPEL 

KIX^HVFV^UGTQmLRDDPKTlJ^RLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

VFDEAILTIFHPKKKKKRCSEGHSCCSn 


3229 


A 


25 


722 


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ 

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT

IGIJD^VANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFHFILFNNVIXjHLYELIXjRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

OJU^KWnMVMUJOQlFKl>WKRPrrG 

FGTYTDVTPRQFFKVQLDTEYRKKWDALVIKLE 

VffiRDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYIXTYSDNPQTVFPR

YCVSWMVSSGMPDFLEKLHMATLKAKNMEIKV 
KDYISAKPLEMSSEAKATSQSSERKNEGSCGPAR 
BEYA 


3231 


A 


2117 


590 


FVPEPPEAGASSPCAPGDPDMSFRKWRQSKFRH 

WGQPVK^^X3CYEDmVSRVTWDSTFCAV^^KF 

IAVIVEASGGGAFLVLPLSKTGR1DKAYPTVCGH 

TGPVLDroWCPHNDEVIASGSEIXnVMVWQIPE 

NGLTSPLTEPVVVLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIEDPRRGTLVAERE 

KAHEGARPMRAMAIXjKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQR 

GMGSMPKRGLEVSKCEIARFYKLHERKCEPIVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPDLISLREAYVPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

KLEEVMQELRALRALVKEQGDRICRLEEQLGRM 

ENGDA 


3232 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM
GTAGAMQLCWVE^FLUTIGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGV1SFIVILVVWI 
BLVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3233 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRIXKjTSRDTFQTVPPNSTTMSLS 
:>tREDAmPSPTCE^ "1 
ILVGVVSOO?KCRKSKESEDPQE " S3GLSE6CST 
ANGEKDSITLISMKMNMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRIXjDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRLMRNKSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNIIXKJKVSMHYSDPKPKINEDWL 

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RIXKJQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDITILRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVIKDKQTQLNRGFAFIQLSTIE 

AAQLLQILQALHPPLTIDGKTINVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYQQDEGYGNSQGTESSLYA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 

GHKETGAPSKEGKEKXEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAEJEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEA1J5KNDMEQNIKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDMGSRMLQAMGWKEGSGLGRKKQGIVTPrEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTDFFU>LI^RCPSAMGIKNKIX}ETPGQILG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 


1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQELPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELIXISFTAHEKIVQFHWRNMHAPGMKKIKLD 

TPFKIA^WRF^V^TTI^^^ 

RGAVLTTTQYOi^iv2I' fc , ' MSRHSQMaKIRSPGKNH 

KWKNDNSRQRA\nPGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESi^KFEKPQHSVIPKEVTPALCSL 

MSSY GSLSGSESEPEE T?1:CTEADVLAENQVLDSS 

APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 

REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRWSRKKMSLKSERRGMVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KKTOEKTRKVTTVKKFFSASSRVGSKKEIQEAKA 

PSPSINRQTSIETDRVSKEFmFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSIEEQSECAQDFYHNVAE 

RMQTRGKWPERVEKEVIIXJIEKYIMTRLYKYVF 

CPETTDDEKKD1AIQKRIRALRWVTPQMLCVPV 

NEDIPEVSDMVVKAITDimMDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPASADDFLPTLIYIVLKGNP 

PRI^SMQYITRFCOTSRLNrTOEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


VLSVCPTGVFRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVTVTGASKGIGREM 

AYHLAKMGAHVVVTARSKETLQKVVSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD 

Muyfflrrm^LNiJi© 

VLTVAALPMLKQSNGSIVVVSSLAGKVAYPMVA 

AYSASKFALIXjFFSSIKKEYSVSRVNVSriLCVLG 

LIDTETAMKAVSGIVHMQAAPKEECALEIIKGGA 

LRQEEVYYDSSLWTTIJJRI^ 

MDRFINK 


3239 


A 


213 


422 


ERTMQLEIKVALOTIIFYLYNKLLW/QPLKKK*EA 
HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240 


A 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA
VAHSCNPSTLVGRGGRITRGQELR 


3241 


A 


161 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEAS 
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
(^DTAQQDLYRKVMLEhTYRl^VFIAGUVSKP 
DLITCLEQGKEPWNMKRHAMVDQPPGR 


3242 


A 


50 


241 


PLPARGKSTLPATFCSPSAPELASMSWPPNRSQT 
GWPRGVTQFGNKYIQQTKPLTLERTTNL 


3243 


A 


380 


702 


FVAYLKI^FFSQVCLFASSEMFFTISRKNMSQKLS 
LLLLWGLIWGIJVfLLHYTFQQPRHQSSVKLREQI 
LDLSKRYVKALAJEENKNTVDVENGASMAGYGK 
1TVEYF 


3244 


A 


37 


1391 


VLMIX3R1^MRSMRLREEESPGPSHTASCLCGSAP 

CILCSCCPASRNSTVSrU,IFTFFLFLGVLVSIIMLSP 

GVESQLYKLPWVCEEGAGIPTVLQGHIDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGFWFFKFLILVGLTVGAFVTPDGSFTNIWFY 

FGWGSFLFILIQLVLLIDFAHSWNQRWLGKAEE 

CDSRAWAGLFFFTIXFYLLSIAAVALMFMYVT 

LPSGCFF^KVFISLHLTFC^^SIA A ,/ LPKVQ > * . 

QPNSGiXQASVfTLYTMFV IWSSA % XmQKCL & 

HLPTQLGNETWAGPEGYETQWWDAPSIVGLIIF 

IXCTLHSl^SDHRQVNSLMQTEECPPMLDATQ 

QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVL 

ASLHVMMTLTNWYKPGETRKMISTWTAVW^ 

CASWAGLLLYL 


3245


A 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAHQV 
LTFLLLFVTTSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIVVEAAAGAGALITLLLMLILLVRLPF 
FKEKEKKSPVGLHFLFLLGTLGP 


3246 


A 


3 


515 


HEVCGSGCCCHCCAGGPVARQKALPRLRGVMS 

RFLNVLRSWLVMVSDAMGNTLQSFRDHTFLYEK 

LYTGKP^WGIXJARTFGIWIIXSSVIRCLCAIDI 

HNKTLYHITLWTrT^LALGHFLSELFVYGTAAFn 

GVIj\PLMVASFSILGMLVGLRYLEVEPVSRQKK 

RN 


3247 


A 


1 


932 


ERLCFPC^QSKIYSYMSPl^CSGMRFPLQEENSV" 

THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 

MKSEEQKKDARKGPLVPFPNQKSEAAEPPKTPP 

SSCIiSTNAAIAKQAIJCKPKGKQAPRKKA 

QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 

ESGKEEGMKJDLBDGKGRGV1ATKQFSRGDFVVE 

YHGDLffilTDAKKREALYAQDPSTXKYMYYFQY 

LSKTYCVDATRETNRIXjRIJNHSKCGNCQTKLH 

DIDGVPHLILIASRDIAAGEELLYDYGDRSKASIE 
AHPWLKH 


3248 


A 


3 


870 


PGST1SCSELKGTQCRATAGSRGRRPPMTCWLRO 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCIATCTRDGKPSARMLLLKGFGKD 

or Kr r T N rho KXuKtLDSNPr ASL VFr WEPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GQYVLYPQVMEFWQGQTNKLHDKLVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQIJIKLFIGGI^FETTDDSLREHFEK 

WGTLTDCWMRDPQTKRSRGFGFVTYSCVEEY 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA

HLTXOCIOFVGGnCEDTEEY 

VMEDRQSGKKRGFAFVTFDDHDTVDKIVVQKY 

HTINGHNCEVKKALSKQEMQSAGSQRGRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 
LQTNGCVTTARPWKHIREALQNVHEEVALRYY 
GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 
EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 
ASNVTHHGYIEKLGEAGIKNESHDIVVSNCVINL 
VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 
EEIRTHKVLWGECLGGALYWKELAVI AQKIGFC 
PPRLV '. ; -iiJTlCy JKELHB v s?? T ^rRFVS ^.TF^J 
. HSKTG cTKRCQVl YNG^. rvjIH. . ^'LMFDANl: TFK 
EGEIVEVDEETAAILKNSR:^. ; QDFLIRPIGEKLPTS 
GGCSALELKDITIT>PrlCIAEE?DSMKSRCVPDAA 
GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGJDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKIXjEAGIKNESH^ 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

TjTYDT 1/TA\TT 1TIAKT1/TJT TI»17T/" v r\/*VQ'Dl/0 ATCDT T7V 

PrKLV 1 AJNJL1 1 l(^NK£L£K.ViulX^Kr VoAlrKUrK 

HSKTGPTr^CQAm^GGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGC^ALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3252 


A 


1 


574 


PLGSNTAPALRVMVQAWYMDDAPGDPRQPHRP 

DrXiRPVGLEQlJlKLGVLYWKLDADKYEN^ 

KIRRERNYSWMDHTICKDKLPNYEEKIKMFYEE 

GDMVTLPAGIYHRFTVDEKNYTKAMRLFVGEPV 
WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQATTLLDPNE 

KY1XR1XDKTTVSHNTKRFRFALPTAHHTLGLPV 

GrOflYLSTRIIXjSLVIRPYTPVTSDEDQGYVDLVI 

KYYLKGVHPKFPEGGKMSQYLDSLKVGDVVEF 

RGPSGLLTYTGKGHFMQPNKKSPPEPRVAKKLG 

MUGGTGITPMLQLIRAILKWEDPTQCFLLFANQ 

TEKDIILREDLEELQARYPNRFKLWFTLDHPPKD 

WAYSKGFVTADMIREHLPAPGDDVLVLLCGPPP 

MVQ1ACHPNLDKLGYSQKMRFTY 


3254 


A 


1 


968 


LQSAGEGVTHVLILLESPARPVAAVTQVQRRRY

HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 

SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 

QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

NTCHGQETTDSSDKKEKKSFSLEEKSKISKNRVH 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 

DASPSTPEENETTTTSAFnQEYFAKRMAALKNK 

PQWVPGSDISBTQVERKRGKKRNKEATGKDVE 

SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EQLRGPCWDQSSKASAQDAGDHVQPA 


3255 


A 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGICR 

MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 

ILKWLHAQQVQQHCPMCRQEWKFKE 


3256 


A 


2 


377 


TAARRRQKGTAARRRQKGTLEEVVLPPRSCRVF 
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSA1ITNDGIGINPAQTA 
GNVFLKHGSELRIEPRDRVGSC 


3257 


A 


3 


1454 


GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTr^PSKPAKSLHTQKNRQFFNEPEENFWM 

VMVVRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTF7JCAMEDGGVKLLKERL 

zio^n-iRYLO'^HLQSc: u-^iWGG"^?FFPL^:^.~~J■ 

LKIQSFINw <:IL SLM VK YTAFL YNDQLrw £(JLh£^ ' 

DDMRJLYKYLTTSLFPRHIEPELAGRDSPm JMP 

GNIXJHYGRFLTGPIJ^NDPDAKCRFPKIFVNrD 

DTYEELHLrVYKAMSAAVCFMEDASVHPTLDFC 

RRLDSIVGPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFTmmMNIAEKSTVHMRKTPSVSLTSVHPD 

LMKILGDINSDFTRVDEDEEIIVKAMSDYWWG 

KKSDRRELYVILNQKNA>nLIEVNEEVKKLCATQF 

NN3FFLD - 


3258 


A 


113 


1558 


APRGCSMPHRKKKPFIEKKKAVSFHLVHRSQRD

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF

FDDDYDYLQHLKEPSGPSELIPSSTFSAHNRREEK

EETLVIPSTGIKLPSSWASEFHEDVGLLNKAAPV

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFJL

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK

GDSNDDYDSAGLLSDEIX:MSWGKTHRAIADHL

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL

NDYYKEKAENCVKLNTL^LEDQDLPMNELDES

EEEEMITVVLEEAKEKWDCESICSTYSNLYNHPQ

LIKYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE

RIQMENGSDLPKVSTQPRSKNESKEDKRARKQAI

KEERKERRVEKKANKLAFKL^KRRQEKELL^K

KNVEGLKL


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLXOYLGIULIIIATISDSHLHTPMYFFLSNLSFA 

DICVTSTTTPKMLNMQTQNKVITYIACLMQMYF 

FILFAGFENFLLSVMAYDRFVAICHPIJ^W 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKnSSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTVVTPMLNPFIYSLRNKDIKRALGIH1XWGT 

MKGQFFKKCP 


3260 


A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HWTLKVLDQVFYQRVSREGELSPSELRKIFSNLE 

DILQLmGI^QMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPIJJJ)NIATYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

PTTVnEEIJO^DLTKRKMIHEGPLVWKVNRDKTID 

LYTLLLEDELVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGD1ATCY SPRTSTESFAPRDS VGL 

APQDSQASNILVMDHMIMTPEMP7MEPEGGLDD 

SGE v AREAJ;TSDEN?$ EQTGAVi ^EH: WPJL 

M3G>r^ILDGYDPVQv:SoT^VASSLTLQPMT 

GEPAVESTHQQQHSPQN'j ISDGAISPFTPEFLVQQ 

RWGAMEYSC^IQSPSSCAr^QSQIMEYIHKIEA 

DLEHLKKVEESYTILCQRIAGSA^TDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK

ILCFFQGIGRLILLLGFLYFFVCSLDELSSAFQLVG

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS

STSTSIWSMVSSSIXTVRAAIPIIMGA>NGTSRRNT

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV

LLPVEVATHYLEIITQLIVESFHFKNGEDAPDLLK

VTIXPFTKLIVQIJDKKVISQIAMNDEKAKNK^V

KIWCXTFTNKTQIK\RIVPSTANCTSPSLCWTDGI

QNWTMKNVTYKENI^^

ILLILSLLVLCGCLDVIRVKILGSVLKGQVATVIKKT

INTDFPFPFAWLTGYLAILVGAGMTFIVQSSSVFT

SALTPLIGIGVITffiRAYPLTLGSMGTTTTAlLAAL 

ASPGNAUISSIXJIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPVVFniLVLCLRLLQSRCPR 

VU>KKI^NWNrl,PLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 

SDSKTECTAL 


3262 


A 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELU}IJ 7 NHTL£ECHVEI^ 

ALYLAMFWGLVENLLVICVNWRGSGRAGLMN

LYE^NMAIADLGIVLSLPVWMIJEVTLDYTWLWG 

SFSCRFTrTmTVNMYSSIFFLVCLSVDRYVTLTC 

ASPSWQRYQHRVRRAMCAGIWVLSAHPLPEW 

ffiQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

LCAWAVFVMCWLPYHVT1X1XTLHGTHISLHC 

HLVHLLYFFYDVIDCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTQHSI 

IITKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


3263


A 


1 


919 


QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAATHVVIYDASIX^LYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFIKTYEDIKENLESRRFQWDSRATGR 

rTlGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAYLCGKPDVPIYDGSWVEWYMRARPEDVISE 

GRGKTH 


3264 


A 


1 


1398 


ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGP 

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDVVRKESESCDCLQGFQLTHSLGGGTGSG 

MC^,J I^mEE^TI;F^nFSVMFJrK\\ c ^TV" VE 

i* r IsArLbVHQLVBKl^DETYSIDNEAL YD1CFRT; . ! 

KLT. "PTYGDLNHLVSAIMSGVTTCLRFPGQLNA 

DUtKI AVNMVPFPRII^^ 

RALTWHLTQQMFDSKNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVTEWIP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRI 

SEQFTAMFRRKAFLHWYTGEGMDEMEFIEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA 


3265 


A 


265 


862 


WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 

RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 

VKXSALCGEQVHILGCEVSEEEFREGFDSDINNR 

LVYHDFF11DPLNWSKTEEAFPGGPLGALRAMCK 

RTDPWVTIAIJDSLSWIXLRLP<mX^ 

HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE 


3266 


A 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQNDLM 

GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 

IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

LGQLTGCVLVLSRNFVQYACFGLFGI1ALQTIAYS 

ILWDLKFLMRNLALGGGLLLLLAESRSEGKSMF 

AGVPTMRESSPKQYMQLGGRVIXVLMFMTLLH 

FDASFFSIVQMVGTALMILVAIGFKTKLAALTLV 

VWLFAINVYrT^AFWTIPVYKPMHDFLKYDFFQT 

MSVIGGLLLWALGPGGVSMDEKKKEW 


3267 


A 


802 


1011 


ASTFCSAWKRRSTAALWWSGSRASRSHPRELGP 

LCFWGTAALSIRSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTIKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 
MTTNAGPLHPYWPQHLRLDNFWNDRPTWHILA 

GLFSVTGVLWTTWLLSGRAAVVPLGTWRRLSL

pwc a v/^nfrrtn ^/Tpntinn/T wct^t t nnn a ttt on 
U Wr A V CUrlrUL V Lc-U W r V JL Y I cUi^X^yJU\lArL,o V£ 

LWKEYAKGDSRYIlXjDNrTVCMETITACLWGPL 
SLWVVIAFLRQHPLRFILQLVVSVGQIYGDVLYF 
LliiJlrvJJvjrl^rlwiiLOxl^ ir YrlVlINAJ-i WLV 
LPGVLVLDAVKHLIHAQSTLDAKATKAKSKKN 




A 

A 


1 *7 


L.JLJ 


KjU 1 OrT^lLJVlo I JLJJo V AoJMLJL/V^lVl VJSjNJUo^oJrLyOlNr 

KYLTKYSRKQVSDEIKKSRRTVESNPIFFKKNKKI 
Q 


3271


A 

A 




^3 


ll^ooJjoJjL.r AUL-oii 1 r\tivjlvALj V Kijrv^r rloC.LAj VAo 

GRPCSPSSAG 


3272


A 


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSVIT 

aJ>FSFSCFniTKCFGI^lFPSVIrTLHVYFm 

YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTTFCRWTQGFVFSESEGSALEQFEG 

GPCAV1APVQAFLLKKLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENIKNEIEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEA1JIYCKVGSYLK1SKIPYIJ)CLASE 

I a' t*vr r A K J ,'MAL v.% ' 3 rbrAj„ ■ *CK V - ^ . : . -a - 

D>VjFIPR; 1 

DPEGLGIlLLGPFLQEl'FPJDQGSSGPESFr / HYN 
GIJCQSNYNEKVMYVEGTAVVMGFEDPMLQTn 
DTPIKRCLQTKWPYTELLWTTDRSPSLN 


3274 


A 


186 


1358 


RWHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPWRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVIXI^ILRKEIJIKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVL\LIQLAKAV1JIMLX^ 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRWRT 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

TIUJLYYLLRSPFYDRFSEARILFLLQLLADHVPG 
VGLVTRPLMDYLPTWQKIYFYSWG 




A 

A 


D J D 


/ 


HESQFTPQMMPLSAPSRAEELGQRPG 


3276 


A 


1 


258 


KAAGHRL1XAAGHPSMPSSDCLLWEGSLELRPL 
OHISSLLVLVSTTCLFAFPRVPIAFESKSCLIYHCH
CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSC^VHDQVIJTPNASSRVIVHVDLDCFYAQVE 

MSNPELKDKP^VC^KYXWTChTYEARKlXj^ 

KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPVVEIU,GFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINIJLDVLHIRLLVG 

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL 

VSGVFKWQQTVLLPESCQHLfflSLNHKEIPGIG 

YKTAKCLEALGINSVRDLQTFSPKILEKELGISVA 

QRIQKLSFGEDNSPVE^GPPQSFSEEDSfXKCSSE 

VEAKNK1EELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDILMKLFRNMVNVKMPFmT^ 

NTAKKGLIDYYLMPSLSTTSRSGKHSrTCNIKDTH 

MEDFPKDKETNRDFLPSGRJESTRTRESPLDTTNF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

QDBPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDERISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSR2OTTTDSHKQT 

VATDSHEGLTENREPDSVDEKJTFPSDIDPQVFYE 

LPEAVQKELLAEWKRTGSDFHIGHK 


3278 


A 


1 


876 


GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 

KRYYRQRAHSNPMADHTLRYPVKPEEMDWSEL 

YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 

IGCGYGGLLVELSPLFPDTLILGLEIRVKVSDYVQ 

DRIRALRAAPAGGFQNIACLRSNAMKHLPNFFY 

KGQLTKMFFLFPDPHFKRTKHKWRnSPTLLAEY 

AYVLRVGGLVTTITDVLELHDWMCTHFEEHPLF 

ERVPLEDLSEDPWGHLGTSTEEGKKVLRNGGK 

NFPAIFRRIQDPVLQAVTSQTSLPGH 


3279 


A 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPFMLLG 

TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 

PQ~7 AERGVRrVSRGRTOL FALNPRSG3L V ■ -.081 

DRE^LCAQSPLCWl^FNiLVEI UwMIOYGVBYEn 

DINDNFPRFRDEEIJCVKVNENAAAGTRLVLPFA 

RDADVGVNSLRSYQLSSNLHFSLDWSGTDGQK 

YPELVLEQPLDREKETVHDLLLTALDGGDPVLSG 

TTHIRVTVLDANDNAPLPTPSEYSVSVPENIPVGT 

RLLMLTATDPDEGINGKLTYSFRNEEEKISETFQL 

DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL 

VASAKVWTVQDVNDNAPEVILTSLTSSISEDCL 

PGTVIALFSVHDGDSGENGEIACSIPRNLPFKLEK 

SVDNYYHLLTTRDLDREETSDYNITLTVMDHGT 

PPLSTESHIPLKVADVNDNPPNFPQASYSTSVTEN 

NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG 

APLSSYVSINSDTGVLYALRSFDYEQLRDLQLWV 

TASDSGNPPl^SNVSI^U^IXJNDNTPEILYPAL 

PTDGSTGVELAPRSAEPGYLVTKWAVDKDSGQ 

NAWLSYRLLKASEPGLFAVGLHTGEVRTARALL 

DRDALKQSLWAVEDHGQPPLSATFTVTVAVAD 

RIPDEADLGS1KTPIDPEDLDLTLYLWAVAAVS 

CVFLAFVIX^LVLRLRRWHKSRLLQAEGSRLAG 

VP a cuttv nvrvsvw a t?i r\TVQTTD\/cT tahcdvcu 
VrAoxir VuVJLAj VK/vr i I oxxc V oLr 1 AJUoxvJSjbrl 

LIFPQPWADTLLSEESCEKSEPLLMSDKVDANK 
EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 
GTWPNNQFDTEMLQAMILASASEAADGSSTLGG 
GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN 

VYIPGSNATLTNAAGKJUDGKAPAGGNGN^ 
GKKEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSWAQGNGAPASNREADTAELAE 

LGP1XEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTI^LQAHHAMEKMEEFVYKVWEGRWRVI 

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

RJHTETGMWTHLLGFVLFLFLGILTMLl^NMYF 

MAPLQEKVWGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFyCS 

PQPRLIYI^IVCVLGISAIIVAQWDRFATPKHRQT 

RAGWLGLGLSGVVPTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLVVAAAFVHFYGVSNLQEFRYGL 

EGGCTDDILL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEKLA 

KIX?AQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSIJCKIAVNMAGIEEVNMIKDIXjTVIHFb^ 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVWKIFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

IETKYEDNKGSNDTIFDNEAKDVEREVCF1DIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKVYR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNDCVCNQHSSPVDDIESHAQTST 


3283


A


159 




IKSKLNQQVRVQKSEV^^\^T3AKG y TMG t '? <: .^"T 
DSG&AAV. .s:VVGGWAVGTVLVALSAlsk* i" S V 
GIAASSIAAKMMSTAAIANGGGVAAGSLVA LQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSo . 


3284 


A 


227 


637 


TSNS1XRPDRMSVMDLANTCSSFQSDLDFCSDCU 

SVLPLPGAQDTVTCIRCG™INVRDFEGKVVKTS

WFHQLGTAMPMSVEEGPECQGPWDRRCPRCG

HEGMAYHTRQMRSADEGQTVFYTCTNCIO^QEK

EDS 


3285 


A 


123 


1535 


HRI^YDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSWIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFrTVTDPAPCS 

TSGWAGLTKLTTRKDNYNAERErXQGATITEAC 

DGSDDFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQREIJQ^KDEECERLSKVRIDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDI^VIQPIVKDCKEA 

DLSLYhnEFRLWKDEPTMDRTCPFLDKIYQEDIFP 

CLTFSKSELASAVLEAVENNTLSEEPVGLQPIRFV 

KASAYECGGPKKCALTGQSKSCKrflUKLGDSS^ 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS 

ADMDFNQLEAFLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSIfflSAQIHTPVAJDOELELGKYGQESEFLCLEFD 
EVKVNQILKTLSEVEESISTLISQPN 


3287 


A 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDGKC 
VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP 
GVSDAYYCKECTTQEKDRDGCPKIVNLGSSKTDL 
FYERKKYGFKKR 


3288


A 


3 


428


RTTFrTOnRPCESlXGDMKIXTH^ 

RGFPLRlXJATEVRICPVEF>n 3 NFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVBGYEENEEFLRTMH 

HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 

EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTI 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNIEDEYKNPRRl^SLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYUDSFQSHDKACTKEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTR5TTLPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGKAFRYFSSLHIHERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AHSNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHE?TrTnMP. 


3290 


A 


2 


: ,3^0 


GRPRSSSDNlh -3*LtO£t AGLSSAAVQTRIGNSAAS 

RRSPAARPPVPArPALPRGRPGTEGSTSLSAPAVL 

WAVAVVVVVVSAVAWAMA>rmVPPGSPEVP 

KLNVTVQIX^EEHRdffiGAI^LLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYV GNTMEDWLVRIYGN 

KTElXVDRDEEVKSrTlVLQAHGCAPQLYCTFNN 

GLCYEHQGEAIJDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWKSNLWLKMGKYFSLIPTGFADEDIN 

KPO^DIPSSQIl^EEMTWMKEILSNLGSPVVLCH 

NDLLCKNIIYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILHQVNQFALASHFF 

WGLWALIQAKYSTTEFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 


3291 


A 


102 


839 


PEAQTSAYLAREKGHLPTMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLHGNSSVGKTSFLF 

RYADDSFTSAFVSTVGIDFKVKTWKNEKRIKLQI 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQIKTYSWDNAQVILVGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNINVKQ 

TFERLVDnCDKMSESLETDPAITAAKQNTRLKET 

PPPPQPNCAC 


3292 


A 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTSLQ 
QRTPAEMSPVU1FYVRPSGHEGAASGHTRRKLQ 

GKU'ELQGVETELCYNVNWTAEALPSAEETKKL 

MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 

LNFSTPTSTNIVSVCRATGLGPVDRVETTRRYRLS 

FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 

ESMPEPLNGPINILGEGRLALEKANQELGLALDS 

WDLDFYTKRFQELQRNPSTVEAFDLAQSNSEHS 

RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP 

NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 

QQGLRHVVFTAETHOTPTGVCPFSGATTGTGGRI 

RDVQCTGRGAHWAGTAGYCFGNLHIPGYNLP 

WEDLSFQYPGNFARPLEVAIEASNGASDYGNKF 

GEPVI^GFARSLGLQLPIXjQRREWIKPIMFSGGI 

GSMEADHISKEAPEPGMEWKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 

NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 

LSDPAGAEYTSRFQLGDPTLNALEIWGAEYQESN

ALLLRSPNRJDFLTHVSARERCPACFVGTITGDRRI 

VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 

VLGKMPRKEFFLQRKPPN1LQPLALPPGLSVHQA 

LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC

VGPLQTPLADVAWALSHEELIGAATALGEQPV 

KSIXDPKVAARI^VAEALTNLWALVTDLRDVK 

CSGNWMWAAKLPGEGAALADACEAMVAVMA 

ALGVAVIXjGKDSLSMAARVGTETVRAPGSLVIS 

AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 

QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 

rTQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 

NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 

DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 

VSVNGAWLEEPVGELRALWEETSFQLDRLQAE 

PRCVAE^RGI RERMGPSYCLPP : "?KAWPT^ 

GGPSPRVAILREEGS ^ ^r^EMAD. iFHlAGFEVW * 

DVTMQDLCSGAIGLD1TOGVAFVGGFSYADVLG 

SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 

CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 

PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 

MEGAVLPVWSAHGEGYVAFSSPELQAQIEARGL 

APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 

DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 

SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTIJCYEIKKLIYVHLVI^ 

VGHIJRLI^HDQVAMPYQAVEYPYII^ILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

LRSGKEAKJLQHFGDGlXmiLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTPEGLELAQKLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ 

QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTHTVRKIlHVGDFVWVAQETNPRDPANP 

GEL\O.DHIVERKRLDDLCSS1IIXjRFREQKFRLKR 

CGLERRVYLVEEHGSVHKLSLPESTIXQAVTNTQ 

VnXjFFVKRTADlKESAAYLALLT^ 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAIKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLST1KCG 

RLQRNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSWffiQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL 

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

KRGTDEMKITIUCTRSAra.LRVFEEYEKL^ 

SITOSlKSETHGSLEEAMLTVAaCCTQNUISYFAE 

RLYYAMKGAGTRDGTLIRN1VSRSE1DLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296 


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERWATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRXRTGVAGS

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTNTLAAQSVIKKDNQTLSHSLKMADQNL 

EKIJCTESERLEQHTQKSVNWLLWANdLIIVCFIFIS 

MILFIRJMPKLK 


3297 


A 


46 


617 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 
TGIPG ^PACROF n 'OLHSL.t nsfv^ MA MVS AMS ^ 
VLYL wISAC. ' ^GSLQitiTFQ^HriLHRPEGG 
TCBVIAAHRCCr^miEERSQTVKCSCLPGKVAG 
TTRNRPSCVDA3J V 7 GKWWCEMEPCLEGEECKTL 
PDNSGWMCATGNI^IIliTRIHPRT 


3298 


A 


157 


748 


IQPPDPRNMTLAAYXEKMKEIPLVSIJCSCFLAD 

PLNKSSYKYEADTVDLNWCVIS 

SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KKLEAAEERRKYQEAEL1JCHLAEKREHEREVIQ 

KjUEENNNFIKMAKEKLAQ^ 

AMIJERIX^EKDKHAEEVRKNKELKEEASR 


3299 


A 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVA 
GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 
LLPASAG\^TLLPVPSFEDVSIPEKPKLRFIERAPL 
VPKVRREPKNLSDIRGPSTEATEFTEGNFAILALG 

/iovt TTVI T/~> TTTTC A /TA vTO T TTXT"D O A /fT"\"D WTA ^TC A TTI 7T> ^ TO 

GuYLH WGHrEMMRL 1 lriRj^MDrlsXiMr A1WRVP 
APFXPITRKSVGHRMGGGKGA1DHYVTPVKAGR 
LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 
RGTLEKMRKIXJEERERNNQNPWTFERIATANML 
GIRKVLSPYDLTHKGKYWGKFYMPKRV 




A 

A 


o 
z 


io*r / 


m/A nnT>Q nofto a A TTTA/TPTT TO VTPT O A nOTW/^P Q 
Jr V AUUrlvUoUonAE 1 IVir JCilixv 1 rlAJAOV^U V \Jj\o 

cilvsiagknvmlix:gmhmgfnddrrfpdfsyi 
tqngrltdfu)cviishfhldhcgalpyfsemvg 
y1x3piymthptqaicpilledyrkiavdkkgean 
fftsqmikdcmkkwavhlhqtvqvddeleika 

YYAGHVLGAAMFQIKVGSESVVYTGDYNMTPD 

RHLGAAWTOKCRPNIXITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKWIYFSTGLTEKANHYYKLnPWT 

NQKIRKTFVQIWMFEFKHIKAFDRAFADNPGPM 

VWATPGMLHAGQSLQIFRKWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFUCQmQELRVNCYMPANGETVTLPTS 

PSPVGISlX3IJLKREMAQGLLPEAKXPRLIJiGTLI 

MKDSNFRLVSSEQALKELGLAEHQLRFTCRVHL 

HDTRK£QETALRWSHIJCSV1JCDHCVQHLPDGS 

VTVESVLIXJAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKEIJ^KANKHHFLHSLALIXjWHCRIG* 
LWAGLLIASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 


LRRNCSALGGLFQTOSDMKGSYPVWEDFINKAG 

KLQSQ1JITTVVAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSEEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKKARQEI 

KKKSSDTLKLQKKAKKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALDSERGRFCTFISMLRP 

YTEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 


3303








£ a^GGPGKI VSWFSGPGS^COTQRR* V . *r~ C 
KSSLLPPSQDFVAGLSVn J', TDDRLi WAFNLY 
DLNKIXK3TKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHVESFFQKMDRNKDGVVTIEEFIESCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSIFKEEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVVVVLKVV 
GMTLFIXYFrXJIFhOCSNDGKl'rrRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\SSWNESWDF(^GKGCTLAJVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTWAGAVSNQLLVWYPAT 

A1ADNKPVAPDRRISGHVGEFSMSYLESKGLLA 

TASEDRSVRIWKGGDLRVPGGRVQNIGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVTTGGDDSGIRL 

WHLVGRGYRGLG/DLGSLLQVP**ARYTQGCDS 

GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH 

VRGAICX^ERRLLTSAEKAIKNKNPPSSKPNRSSSW 

WGTTOX^VNAKQGPKPSPGHRIJUWLroEKWEI 

DFIXVKPHQAGYKYLLVLVDTFSGWTEAFATK 

NETVNMVVKHXNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 

YLQSP 


3308 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTOSPFQRDF 

MEQRRFSDEtFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DWYHDYFYTINRYTLTRVAKNKSRLRVSTELRY 

RKQPWGLVKTFEEKNFWSGLEDYFRHL 


3309 


A 


490 


1077 


NSPSLDFM5NEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRV1PYTITLTNP 

LEHKTATVRETQTN1YKASQESECYV1DAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310 


A 


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS 

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGRNTA 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA

AQMAALQ/u" * LAJTG": ~v vVNPLf7.T 

Z ^Q^KKRK^XWQGKKEGDkCoSA^i MGKN 


3311


A


177 


4 


PIQBPPRITPPRPSPHLLTPRTGSS. p ; PPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 


3312 




3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPi i\5J SCILKS 
P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 
AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSP\S 
ASAPCRAWLSPRRLTWPPHLQVGILIPTGRPWK 
NL 


3313 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL*SQLlJtRLRRENW*NPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 
PRCPAALRAGAHIGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 

KSNSMLQKPTVAYVRPMDGQESMEPKLSSEHYSS 

QSHGNSMTELKPSSKAHLTKLOPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

IJKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 

SDSEANEPSQSASPEPEPPPTNKWQIJ3NWLNKV 

NPHKVSPASSVDSNIPSSQGYKKEGREQGTGfNSY 

TDTSGPKETSSATPGR\APKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSRE1JETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEKKNWEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLDSSKPRRTKLWDDRNYSADHYLQEAKKLKH 

NADALSDRFEKAVYYLDAWSFIECGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 

N\TRDIKTAAKELLKKVKJTTOSAU^GMVEMMD 

RRPYWCISRQRVWGWIPVFHHKTKDEYLINSQT 

TBfflVKLYEQHGSDIWWTLPPEQLLPKEVLSEVG 

GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512 


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

SCWPDPSRNTELAFESQLWLCVQLVAIAILTLTF 

GKLSGWVSVPWLLIr^MILFIFLLGYAWFSSHTSP 

LYWDCLLMRGHEITEQPMKAE\RAGSIMVKEAIF 

LFRKGHSKGKLFLLFFLPFLQVHKTFPTTDGFHW 

AP 


3319 
3320


A 


407 


1 


SSLHRSPRPASPLPVPEAPVSFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 

TGP2/PS ',V4 ORLPRPRI '? ' W * YI AS?. 




4037 


343'^ 


QMSEaVAEKMLQYRRDT AGWKlcr" : ^ GVSVb 
WRPSVEFPGNLYRGEGIWGTLEEVWDCVKPAV 
GG1JIVKWDENVTGFEUQSITDTLCVSRTSTPSAA 
KdKLISPRDFVDLVLVKRYEDGTISSNAraVEHPL 
CTPKPGFVRGFNHPCGCFCEPO^EPTKTNLVTFF 
HTDI^GYLPQNVVDSFFPRSMTRFYANLQKAVK 


3321 


A 


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


AIVEDKHSGRSYDITSDLGNVLTSTSIAKTVNG*A 

ESSDSGAESDEEDAQEDLMGAYHSDEDKKMMKI 

VADHKNI^IVTNGYDKIXjFVHDIQNDIHASSSL 

NGRSTVHVKPIDENLCjQTGKSAVCIHQDINDDH 

VEDVT 


3323 


A 


8 


459 


DTLSLNCIXPETLPMTPSF*LSFL*FPGLARAKSIP 

tktysnevvtlwyrppdillgstdystqidmw*g 
qvevwqgpcgkggglvttatqpaaflftvpslp 
rgvgcifyematgrplitgstveeqij1fifrilse 

liA W ALCA V B 1 HR 


3324 


A 


1276 


466 


PGSTHASARITIY*L*1ILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPUPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 
SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGMEIGI^SEPHIIAGAVNPTLGKCNI 

SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN 

RNLLSmFDfflTRTGKJYDDHRKFTLRILYDQTGR 

PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSFL*SPQL*LSnCYSAFVSFQSVMLLLHS 

QRRYIFEYIXJPIXXI^VTMPSMVRHSLQTNl^V 

GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 

I^DSSTL1A*LLTVFVLVPAGPLIGRQIFRFSEEGL 

VNARFDYSYNNFRVTSMQAVINETPLPIDLYRYV 

DVSGRTBQFGKFSVINYDLNQVITTTVMKHTKIF 

SANGQVBBVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVGVDANITRYFYEYDAIXjQLQTVSVNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEEXjFLRQRGNDIFEYNSNGLLQ 

K^YNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

UAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 

EILYTPYGDIYHDTYPDFQV1IGFHGGLYDFLTKL 

VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP 

PT^TKLDCYGIFHFLFLILCLTDIRSWLELFGFQL 

HNVLPGFPKPELENSPSI* qmsnsmlhllcasls * 

TILGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 

GGKQPRFAAVPSVFGKGDCFAIKDGIVTADIIGVA 

NEDSRRLAAILNNAHYLENLHFTIEGRDTHYFIK 

r .GSLFF 1 *: 7 -:^^. T GNTOOR?JT HNGVNVTV3QMTSV * 

LNGH^uvT.U "QLQHGALCrTsmYGTTVEEEKld 

VLEIAKO :IAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 


KACLHIl^SFirSNFLFNPLLPDSLYSVEARSQRA 

NIXjPCRRKRIXJTLMRLAAGFQYSSHKDPSLSAK 

EKHTDYHNEARGPWPGWVG*RTADGSCGRGPD 

GAHHPGPKSSSWRASRLLPGLGGSHHLDAYVGR 

DLECGTPAPLQLEEPPQPRGHPAPIPTGQAGPRDS 

GPGASP*VETRPLTDGRR*PGVRPVGWTPAHPAG 

TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 

AVPKHRAWRTPLCSQ 


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 

SFCQKGTLTVHQRTHTGEKPYECNECGKNFYQK 

LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 

QRTHSGERPYVCHDCGKTFSQKSALlSnDHQKIrrr 

GVKLY 


3328 


A 


1 


270 


VTRKLPIF1VDAFTARAFRGSPAADCLLENELDED 
MHQKIAREMNLSETAFIRKLHPTDNFAQRSCFGL 
IWFTPTTDLQILTSSILPSIL 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSNIINNSQCHKQG 
DFPYQVGTELSIQISEDENYIVKKADGPNNTGNP 
ErTILRTQDSWRKTFLTESQRLNRDQQISIK>IKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430 


FWRNFTGLAPAAAVATTTSSSTMRFTSISNSLTST 
AAIGLSFTTSiiriAlti lNll'i lilSGFTVNQNQ 
II^RGFENLWYTSTVSVVTTPVMTYGHLEGLIN 
EGNLELEIKRKLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPABAGVLLAYNKNQQIKIP 
PGTPIYECNSRCC^GPIX^NRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKDCRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYIJTDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

NNVWRINTLILRTNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWIIFLPPLTSCPLWAPGTKHKTILEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQfflSSRRHBIVDPV 


3334 


A 


304 


410 


AGPSLPSNUfcQIFQSLPPFMDII^ 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNATDFLMDRNNWPRI 
NTLEJlTNQQYLNLISTSVTADVEDr^TFFFLDSQ 
DKSAVIAKNMYYLTQDDESnSAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGI1YNPFF 


3336 


A 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 

VLERLAGGATRDSAASDELLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHIIKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVIEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 

LTAFHWEIJTIO/HELEFVDYVFHGE 


3337 




444 


43 


KTLJ -CL ANQFPiii 5 cF/.LP. . 'JVA- JLL- ... . DEA L/ 
GFI&AC&LAG^ 

LVNKYCQAAHKLMVAVSEDVLQV^ ADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVA ,AXXF 


3338 


A 


1 


398 


reGKVRGRSAEMPGSDTALTVDRTYSDPur<HHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEY1JCLVNNADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665 


AAAASNWGLITNIWSWGVSVLTMPFCFKQCGI 

VLGALLLVFCSAVMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTCIAFYV 

VIGDLGSNFFARLFGFQVGGTFRMFLLFAVSLCI 

VIJI^LQRNMMASIQSFSAMALLFYTVFMFVrvnL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACXJSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSV1KRTPRKY1AEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKVFRNERREGLIQARSIGAQKAKLGQ 
VLIYLDAHCEVAVNWYAPLVAPISKDR 


3342 


A 


385 


2 


NLTWWPLFRDVSFYIVDLEVILIIFFLDNVIMWWE 
SIJLLLTAYFCYVVFMKF1WQVEKWVKQMINRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASlJINSOvlRNSIFQNKIHTLDPHV 


3343 


A 


1 


385 


FRVDNSEEWKDVFnSSERSFKLDSLKCGTWYKV 
KLAAKNSVGSGRISEnEAKTHGREPSFSKDQHLF 
THINSTHARLNLQGWNNGGCPITAIVLEYRPKGT 
WAWQGLRANSSGEVFLTELREATWY 


3344 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351 


147 


SPACITSSLSQHIADPRAAPTBVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AGIRHEAPPTTSNRHRRQIDRGVTHLNISGIJCMP 

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMYWSDWGNHPK 

IETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKLSVIGSIRLNGTDPIVAADSKRGLSHP 

FSIDVFEDYIYGVTYINNRVFKIHKFGHSPLVl^LT 

GGLSHASDWLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCE1JXJCWEHCRNGGTCAASPSGMPTCRCP 

TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 

CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACWNK 

QSGDVTOTCTIXjRVAPSCLTC^GHCSNGGSCTM 

NSKMMPECQCPPHMTGPRCEEHVFSQQQPGHIA 

SBLIP 


3347 


A 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITG\SHRARPENGFENIF 


3348 


A 


1 


1171 


LSK1TMPVICNEPLSFIQRLTEYM*HTYFIHRPSSL 
SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 
GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 
NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

k'JBAYTW^l 2 TCCVi^W*T^yi ^VIEQYGI « V BTINMf 

ktgdkcvl:^:^ 

KLCALY^>KWTECLYSVDPA1TOAYKKM)KKNT 

EEKKNSKQM^TSEELDEMPVPDSESAOTIPGSVLL 

WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IPKTIX:RLRPDniAMENGEIDQASEEKKRLEEKQ 

RAARKNRSKSEEDWKTRWFHQGPNPYNGAQD 

WIYSGSYWDRNYFNLPDIY 


3349 


A 


403 


497 


NFASSSGKYLRTQKIKCLNNKFTPrTrTEKK*SQS 
VTO > P*SNRIY*ILQS*NISFS*LPN*NFASSSGKYLR 
TQK1KCLNNKFTPFPTTEKK 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSr^ESDIRKPARRKI(7rTNP 

DFLLLLFMSVPWSAPPFCPPAEGSRDGRPKASV 

ARPAAVHEHHSPRDCGHLPDVIRSSLGGWQPH*P 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 

GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 

SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 

LPPGAWVSSSGQRPGLTHPLAYSHGCVPSEG 


3351 


A 


1 


428 


MAAWAATALKGRGARNARVLRGILAGATANK 

ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 

GKNPMKAVGIJVwAIGFPCGllXFl^^ 

VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 

DVGSGVQT 


3352 


A 


2 


841 


RTLFRGRRRREDDRISRPHPSTAESKAPTPKFDLL 
ASNFPPl^GSSSRMPGELVLENRMSDVVKGVYK 

EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNWSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN 

HPRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 


3353 


A 


1054 


587 


IATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 

PPGKRECRVGQYVVDLTSFEQLAXPVLRNADCS 

SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 

TIILGTIP\rPKGKPIALVEEIRNRKDW 

NRNHLLPDIVTCVQSSRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVI^I^BVVER 

VLTFI^AKALLRVACVCRLWRECVRRVLRTHRS 

VTMSAGLAEAGHLEGHCLVRWAEELENVRILP 

HTVLYMADSETFISIJEECRGHKRARKRTSMETA 

lAIXKLFPKQCQVLGrVTPGIVVTPMGSGSNRPQ 

EIEIGESGFALLFPQIEGIKIQPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRWLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGWGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFDJUCCNEVKDDDLFHSYTTTMALIHLGSSK 


3355 


A 


1 


707 


GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYD1Y 

SRLLRERJVCVMGPBDDSVASLVIAQLLFLQSESN 

KKPIHMYINSPGGVVTAGLAIYDTMQYILNPICT 

WCVGQAASMGSLLLAAGTPGMRHSLPNSRIMIH 

QPSGGARGQATDIAIQAEEIMKIiCKQLYNIYAKH 

TKQSLQVIESAMERDRYMSPMEAQEFG1LDKVL 

VUFPQPGEDEPTLVQKEP \ 5 APA \£^A27 f 


3356 


A 


3J2^ 


338 


FNYHFCRNLE ■ )ir Si^V*PGMCGLLAKHLSFI>; ; V ti 
AFLJT/lX5VAAU^AVA*PRKKAYADFYRNYi' v 
DCEFEVRKANISQSTK 


3357 


A 


1 


403 


A1X5SCGGLLGTGLLKGTMSGTLWSKGIFAGYKR 
RIRIQREHTAV1JCIEG\VYARDETEFYLRMICANV 
YKANNNTVTPVLTPDKTOVMWRKVTQAHGISI 
MVRAQFRTNIJ>ADAIGHRIRMML*PSRMYTIEPS 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

VMDSERQVKDTDDEESPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 

RGSSDGRGSDSESDLPHRKLPDVKKDDMSARRT 

SHGEPKSAWFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPR1\PV 

SVDDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSRKRSVSQDLIK 

KEEERKKMEKLLAGEDGTSERRKSDCTYREIVQE 

KERRERELHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRC^SLPPPKFTATVETTIARASVLDTSMSAGS 

GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEGVARVHGSPLEI^QDNGSIHMKKPNSV 

PQELAATTEXTEPNSQEDKNIXjGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 

KIXJKKPENEMSGKVELVLSQKVVKPKSPEPEAT 

LTFPFLDKMPEANQLHLPmNSQVDSPSSEKSPV 

TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 

YQ\KEQDK\LKEE\WEKAQKEVEEEERRYYEEEP* 

mEDPVVPFTVSSSSAlXJI^TSSSMTEGSGTMNKI 

DLGNCQDEKQDRRWKKSFQGDDSDLLLKTRES 

uj\ L» c,r» IS.vJa.Li X JivJ/\J-*/\xlovJlNr V oJVlJ V rjrA Jri\l LiXJl 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 

KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 

PLGKGAAMIIETLNLYFHIQCFRCGMCKGQLGDA 

" >^VJ XXJy JVUvlN vJJ_/J^rl>( Vv-XYJL'O I Wli\j&ISJ2in\J\Jjr X I Lj 


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV" 

RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR 

AWAGRATSM*TSSYSSEYQPQTP*ALVTLPPRSY 

YT 1 TRT T TT TTJT Wlinn FED 


3360 


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYmSTHFWGPVANWGLPIAAITDMK\ 
KSranSRRMTFAL*CYSLTFVRPAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL 


3361 


A 


4619 


532 


LLLGRANSPPYNSWRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARLGDAAGGDPASGQAARGCGARAPRGLGR 

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE 

RKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFR 

KNQKG1MRQTSKGEDVGYVASEITMSDEERIQL 

MMMVKEKMTTIEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

K^LHKLVNSTR ^ TtXKI £ 1RVEEMKKP\ST ; * ;> ;EF g 

HVTE^SPVUDERS.iLY3GVII : I ?JFFDQ$?EKVP& 

EDDSDSLTTSPSSSSLDTWGAuRKLVKTFSKGES 

RGLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEM 

KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVIKSPTASRISIX5KKVKSVKBT 

MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSELXjDDEEPPYRGPFCGRARV 

HTDFIPSPYDTDSOOJCKGDnDJISKPPMGTWMG 

LlJNHSIKVGTF>nTYVDVLSED\EEKPKRP^ 

GRPPQPKSVED1XDRINLKEHMPTFLFNGYEDLD 

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS'CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 

YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 

CDPPGC*LVLN\KNRRKPPSFPSCRSC^ETL\EGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 

IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPQKLTTKKLEGSIAASGRGLSPPQCLPRNY 

DAQPPGAKHG1ARTPLEGHRKGHEFEGTHHPLG 

TKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 

LQEHGVKLGPALTR\KVSCARGVDLETLTENKL\ 



HAEGIRSSRREPYS*LRHGRCGI\F\EALVQRYAED 
LIXJPERDVAANMDQIRVKQLRKQHRMAIPSGGL 
TEICRKPVSPGCIS\SVSDWLISIGLPMYAGTLSTA 
GFSTL\SQVPSI^rn*CLQEAG\ITEERHIRK\LLSAA 
RLFXLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSWLARARRTDRWT 

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSEWIFTDVNSILRYLARVAT 

TAGLYGSNLMEHTEEDHWLEFSATKLSSCDSFTS 

TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 

NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 

VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 

MGKVTVRFPPEASGYLfflGHAKAALLNQHYQV 

NFKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 

HIKPIXJFTYTSDHFETIMKYAEKLIQEGKAYVDD 

TPGEQIKAEREQRIESKHRKNPIEKNLQMWEEMK 

KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 

IQPHPRTGN*YW\AYPTYDFACPIVDSIEGVTHAL 

RTTEYHDRDEQFYW1IEALGIRKPYIWEYSRLNL 

NNTVLSKRKLTWFVNEGLVDGWDDPRFPTVRG 

VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 

WAFNKKVIDPVAPRYVALLKKEVIPVNVPEAQE 

EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 

TFSEGEMVTFINWGNLNITKIHKNADGKnSLDAK 

LNLENKDYKKTTKVTWLAETTHALPIPVICVTYE 

HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 

LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 

PCVLIYIPDGHTKEMPTSGSKEKTKVEATKNETS 

APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 

WRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 

? ?£?YKPC>^V^^ 

VriA^ jEYVRKlitAEKSPKAKINEAVECLL:^ > A 

QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 

irEAKVLFDKVAS<^ENTSl<KIXTEKAPKDQVDI 

AVQELLQLKAQYKSUGVEYKPVSATGAHDKDK 

KKKEKENKSEKQNKPQKQNIX}QRKDPSKNQGG 

GLSSSGAGEGQGPKKQTEILGLEAKKXEENLADW 

YSQVITKSEMIEYHDISGCYIIJ^WAYAIWEAIKD 

FFDAEIKKLGVENCYFPMFVSQSALEKEKTHVA 

DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 

YAKWVQSHRDLPIKLNQWC^rVVRWEFKHPQPF 

LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 

QVYEELLAIPVYKGRKTEKEKFAGGDY ITl'lEAF 

ISASGRAIQGGTSHHLGQNFSKMFEIVFEDPKIPG 

EKQFAYQNSWGLTTRUGVMTMVHGDNMGLVL 

PPRVACVQVVIIPCGITNALSEEDKEAnAKChJDY 

RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 

VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN 

EAETKLQAIL^DIQVTLFTRASEDLKTHMVVANT 

MEDFQK1LDSGKWQIPFCGEIDCEDWIKKTTARD 

QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 

KNPAKYYTLFGRSY 


3363 


A 


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 

IJ\APKETDCVLTQK\IJ0JETLKPFG 

SRRNFNFGKN*INLVKEWIRRNQ*KAKNLPQSVI\ 

ENV\GGKIFT/rl-GSYRiyGEVHTKGADIDGVCVF 

APRHVDRSDFFTVSFYDKLKLQEEVKDLRAVEEA 

FWVIKIXTIXjIEIDILFARLALQTIPEDLDLRDDS 

IIJCbai>IRCIRSLNGCRV^ 

LRAIKLWAKRHNTYSNILGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLWSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPIITPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEJDLLSKAE 

W^KLFEAPT^QKYKHYIVLIASAPTENQR^ 

VGLVESKIRILVGSLEKNEFniAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFKKTENSENLSVDLTY 

DIQSFTOTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQIXPNHVLQKKKKHSTEGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SIPQTATQPAISPPPKPTVSRVVSSTOLVNPPPRSS 

GNAATSGNAATKIPTPIVGVKRTSSPHKEESPKK

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETTQTAASLLASQKTSSTDLSDIPALPANPIP 

VDCNSHCLRLNR 


3364 


A 


54 

... 


3073 


SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 

PRNRHVVREKTGAEEQAVKRRGKRE17LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYffiRGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRIJRKQGSILYCTTGIXLQWLQSDPYLSSVS 

I-T=/Lr2IHERNf r : 7*^VLM"Tv J VIirLL j TTJSDLKVT * 

LMSATLNA^5b X I' GNCPMIHIPGFTFP WEYLL 

EDVmmYVPliQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKER^TDYVRELRRRYSASTVDVIEM 

MEDDKVDLNUVAIJ^/IVLEEEDGAILVFLPGW 

DMSTLffl)LLMSQVMFKSDKFUIPLHSLMPTW 

QTQWKRTPPGVRKIVL\TNIAETSITIDDVVYVID 

GGKKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWTIQLPEIF/R

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQL\RSLNALDKQEELTPLGVHLARLPVEP 

HIGKMILFGALFCCIJDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKIIKAVIC 

AGLYPKVAKIRLMXjKKRKMVK\nrrKTDGLVA 

VHPKSVNVEQTDFHYNWLIYHLKMRTSSIYLYD 

CTEVSPYCLLFFGGDISIQKDNDQETTAVDEWIVF 

QSPARIAHLVKRAVVHMDERREEQIVQLLNSVQ 

AKNDKESEAQISWFAPEDHGYDKKYFFKE 


3365 


A 


439 


878 


ECCNVRPUIETDLLKMKRKPRASSPVVEEQPRA 

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRBLEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 

SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
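The column legend above can be applied mechanically when reading an entry. The following is a minimal sketch, not part of the original listing: the helper names `summarize_peptide` and `expected_residues` are hypothetical, and the coordinate arithmetic assumes that the begin and end locations are inclusive nucleotide positions with three nucleotides per residue, which the table itself does not state.

```python
# Hypothetical helpers for reading one table entry; the symbol meanings
# (*, /, \) are taken from the column legend, everything else is assumed.

def summarize_peptide(seq: str) -> dict:
    """Count residues and legend symbols in a printed peptide sequence."""
    cleaned = "".join(seq.split())  # drop line breaks and stray spaces
    return {
        "residues": sum(1 for ch in cleaned if ch.isalpha()),
        "stop_codons": cleaned.count("*"),
        "possible_deletions": cleaned.count("/"),
        "possible_insertions": cleaned.count("\\"),
    }


def expected_residues(begin: int, end: int) -> float:
    """Residue count implied by the coordinates.

    Assumption: inclusive positions, three nucleotides per residue.
    begin > end is treated as a reverse-strand entry.
    """
    return (abs(end - begin) + 1) / 3


if __name__ == "__main__":
    print(summarize_peptide("MK*LV/G\\R"))
    print(expected_residues(977, 594))
```

Under these assumptions, the coordinates 977 and 594 listed for SEQ ID NO: 3369 imply 128 residues. A non-integer result, or a mismatch with the printed residue count, may simply reflect the possible insertions and deletions that the legend flags.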


3366 


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFKNNFGVREPYQILLDGTFCQAAL 

RGRIQIJ<EQU 5 RYIJvlGETQLCTTRCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGNPHHYFVATQDQNLSVKVKKKPGVPLM 

FHQNTMVIJDKPSPKTIAFVKAVESG\RLSQCMRK 

KVSMSKRNRV**KTLNRGRRKKRKKISGPNPLS 

OJaCKKKAPDTQSSASEKKRKIU^^ 

LSEKQNAEGE 


3367 


A 


40 


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSFTTFFSETGNGKHVPRAVMIDLEPTVVD 

EVRAGTYRQLFHPEQLTTGKEDAANNYARGHYT 

VGKESBDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGITSIXMErll^LDYGKKSKLEFAIYPAPQVS 

TAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDI 

CRRl^mRPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYFRIHFPLVTYAPIISAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDVWKDVNVAIAAKTKRTIQFVDWCPT 

GFKVGINYQPPTVWGGDLAKVQRAVCMl^NTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE 

GEEF 


3368 



A 


3 


2597 


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG 

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE 

LNIPHAGAWAQIPEETGLPQGRDTTQLLASEMV 

HLMMK\LKEKR\RAI*AQKKKMEAAFTKQRQKM 

GRTAFLTVVKKKGDGISPLREEAAGAEDEKVYT 

ORAKEKi^i < \ f C'TDGQTlSK "I «AL-v .iSMEN^AKV* 

eklnsslhflqqemqrlslq \ )emlmqmreqqs 

wvisppqpspqkqirdfkpskqag:;-ssaiapfssd\ 

spr\pthpsstsll>frksasfsvksq?/r?rpnelki 

tplmtltpprsvdsij*rijau?spsqwiqtosfvc 

fgddgepqlkeskpkeevkkeeleskgtleqrg 

hnpeekeikpfestvsevl^lpvtetvcltpnedq 

lnqptepppkpvfpptapknvnlffivslsdlkppe 

kadvpvekydgesdkeqfdddqkvccgfffkd 

dqkaendmamkraallekrlrrbketqlrkqq 

leaemehkkeetrrkteeerqkkederarrefir 

qeymrrkqlklmedmdtvdcprpqvvkqkkqr 

pksihrdhiespktpikgppvsslslaslntgdnes 

vhsgkrtprsesvegflspsrcgsrngekdwen 

asirssvasgteytgpklykepsaksnkhhqnal 

ahccij\gkvnegqkkkileemeksdannflilf 

rdsgcqfrslytycpeteeinkltcigpk^itkkm 

ieglykynsdrkqfshipaktlsasvdaitihshl 

wqtkrpvtpkkllptka 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSIJECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVIJFPC*AMDHI^FIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEE1GKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKKNQDRIJRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMTVWKHGLLI 


3371 


A 


345 


1383 


DI^LECTGFKETNLGVYFLSSKWVl^YALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3372 


A 


239 



3348 


PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 
MSDDVHSIX}KVTSDIj^CRRKLTS\*GGLSEELGS 
ARRSGEVTLTKGDPGSLEEWETWGDDFSLYYD 
SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 

eeeeeeeeeeeeedeesgnqsdrsgssgrrkakk 

kwrkdspwvkpsrkrrkrepprakeprgvngv 

gssgpseymevplgslelpsegtlspnhagvsnd 

tssletergfeelplcscrmeapkidriseraghk 

cmatesvdgelsgcnaailkretmrpssrvalm 

\ox:ethrarmvkhhccpgcgyfctagtflechp 

dfrvahrfhkacvsqlngmvfcphcgedasea 

qevtiprgdgvtppagtaapappplsqdvpgrad 

tsqpsarmrghgeprrppcdpladtidssgpsltl 

pncgclsavglplgpgi \7 ley al • tqe**;? -;:'y 

lrfeprql £'*..-"» ^qgelqkvii^1lij)nl: :-, nr 

ix^skrtplhaaaqkgsveioivllqaga>l;t>ia 

vdkqqrtplmeawnnhlevarymvqrggcv 

YSKEEIXjSTCIJIHAAIQGNLEMVSLLLSTGQVD 

VNAQDSGGWTPIIWAAEHKHIEVIRMLLTRGAD 

VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 

HAVNYHGDTPIjnAARESYHDCVLLFLSRGANP 

EIJINKEGDTAWDLTPERSDVWFALQLNRKLRL 

GVGNRAmTEKnCRDVARGYENVPIPCVNGVIXj 

EPCPEDYKYISENCETSTMNIDRNITHLQHCTCV 

DIX^SSNCUXKJLSIRCWYDKIXJRLLQEFNKIEP 

PLIFECNQACSCWRNCKNRWQSGIKVRLQLYR 

TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD 

VREDDSYLFDLDNKIXjEVYCIDARYYGNISRFIN 

HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 

ELGFDYGDRFWDIKSKYFrCQCGSEKCKHSAEAI 

ALEQSRLARLDPHPELLPELGSLPPVNT 


3373 


A 


587 


1584 


PIXjRLWSCSEDKTIKIWDTTNKQCVNNFSD 

FANFVDFNPSGTCIASAGSIXJTVKVWDVRVNKL 

LQHYQVHSGGVNCISFHPSGNYLirASSDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVIXWRTWDELHCKGLTKRNLKRLHFDSP 

PHIXDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLEJPQQQKPWGLCQTRV 

KRPVDIS*TU 5 *CHQNVCQQPRKRKQKT*VTSPV 
KVKA^SIPLAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPMALSILDIKMSPSWYFHMAIGIINWNTTAG 
I^GTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLrTCIVHYWPGAVAHAYNPSTLG 
GQVG/WQIT*GQEFETSLDYMVKPHLY 


3375 


A 


3 


1051 


VPTQQnjy^PEQThTTKDWTVTPEHVLPESQSLLT 

FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 

ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 

DSAGKSPTGLKLKNDTENHQPVSLSDLEZQASAG 

VISKKAKVKVPQKTAGKENHFDMHRVGKWHQ 

DFPVKKRKKI^TWKQELLKLMDRHKKDCAREK 

ITKCQECGKTrllVSS\DL\IKHQRIHTEEKPYKCQ 

QCDKRFRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNIJHIIQRTHTGEKPFTCHECGKKFSQNS 

HLIKHRRTHTGEQPYTCSIC^RKOTSRRSSIXRHQK 

LHL*REACPVSHFWKTF 


3376 


A 


137 


2329 


sfespaplpstcfpqerqdpgpcyvsgamaglgp 

gvgdseggprplfcrkgalrqkvvhevkshkft 

arffkqptfcshctofiwgigkqglqcqvcsfvv 

hrrchefvtfecpgagkgpqtddprnkhkfrlh 

syssptfcdhcgsllyglvhqgmkcsccemnvh 

rrcvrsvpslcgvdhterrgrlqleiraptadei 

hvtvgearnlipmdpnglsdpyvkiiclrpdprnl 

tkqktrtvkatlnpvwnetfvfnlkpgdverrl 

svevwdwdrtsrndfmgamsfgvsellkapvd 

gwykllnqeegeyynvpvadadncsllqkfea 

cnyplelyervrmgpssspipspspsptdpkrcffg 

aspgplhisdfsflmvlgkgsfgkvmlaerrgsd 

zlyaikilki^-.t;qddd\ DvTtl\vkrvt algg/ • 

kgpggrphflixjlhstfqtpdrly^ v tgg" 

dlmyhiqqlgkfkephaafyaaeiaiglfflhnq 

lIIYRDLKlJDNVMLDAEGHIKrro 
GTTTRTFCGTPDYIAPEHAYQPYGKSVDWWSFG 
VLLYEMI^GQPPFDGEDEEELFQAIMEQTVTYP 
KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 
IRAHGFFPLGFDWERLERLVEIPASFSRPRPCGPQR 
RGIFDKFrTRAAPA\LTPPARLVLDSIDQADFQGF 
TYVNPDFVQPDARSPTSTVHVPVM 


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTrTTHTHTHTHTRHAPFCGTCLYY 


3378 


A 


1126 


456 


FSKLIMKTFOGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSWSTDQESAEEIPILIIEG 

FIJLJ r NYKPIJDTIWNRSYFLTIPYEECKR^ 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


3379 


A 


1126 


456 


FSKLIMKTFnGISGVTNSGKTTLAKNLQKHIJPNC 

SVISQDDFFKPESEEETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSWSTDQESAEEIPELIIEG 

FLIJ ? NYKPLDTIWNRSYFLTIPYEECKRRRSTRW 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YUXjTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


3380 


A 


1443 


794 


ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHLECFKCAACQKHFCVGDRYLLINSDIV 

CEQDIYEWTKINGMI 


3381 


A 


945 


474 


SLKIJIKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQhTyT)KKLVKClEELCQKQEELCWQIQ 
QEEDKKQRLQNEVRQLTEKLACVNEKLARVNE 
NLARKIASCSKFYQTLA£TEATYLKIIJESF*\TLLS 
VRKREAGNLTKATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GIRGKMADRGGVGEAAAVGASPASVPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 

IX^ENIAAVTVFLNSLTPKFYVALTGTSSLISGLIFI 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNl^ECKVWRNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 

RNNL*AVTAVPAPKSSA*SSTEERYQCTGIY*LKI 

GNVCKKIRKNKRSSKNNERFDE 4 ISSSYHVEHP* 

KSL\KSLLELQAYPDVQAVLAKYDDISLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 


3383 


A 


282 


2443 


RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 
SSPLLKDLESMKTGLrTlTLLGTAAAIPTNARLLS 
DHSKPTAETVAPDKTAIP;;^! ^/^E A^jT 
TEDDSHHKAlfW'& 

QEIXjIEGFKRDSIXjSL*VW^\EYGTKIJCGT^ 

KEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQBS 

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\U>REHANSKQEEDNTQSDDBLEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

FTOIX5NTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HS1AYSPSKLREQ 

REKVHENEMGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPVTVCPPTKPLDQVCGTDNQTYASSCH 

LFATKCRLEGTrCKGHQLQLDYFG\ASKSIPT\CRD 

FBVIQ\rTLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRKKVKKIYL\DEKRLLAGDHPIDLLLRDFK 

KNYHMYVYPVHWQFSEIJXJHPMDRVI,THSELA 

P1JRASLWMEHCITRFFEECDPNKDKHITLKEWG 

HCFGDCEEDIDENLLF 


3384 


A 


3166 


928 


PSRPHPIHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLWSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 

SDVDQWDTAALADGIKSFIJLALPAPLVTPEASAE 

ARRAl^AAGPVGPALEPPTLPLHRALTLRPLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDUSREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLIKVFHRDGHYGFSEPLTF 

CSWDLINHYRHESLAQYNAKLDTRLLYPVSKy 

QQIXJ1YKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETIKIFEEQG 

QTQEKCSKEYLERFRREGN/QTBCEMQRILLNSER 

LKSRIAVEIHESRT\KL\EQQLLVPRASDNKRD/IDK 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

INEWIXjIKNETHXJYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVVVDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

EL\OJrIYQHASLVQHNDALT\TIlAHPVRAPGPGP 

PPAAR 


3385 


A 


43 


2372 


TRDWSWKJELCFNHYNKETTN 

KHFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHfflSHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAmFTQEEWKRIJJPAQRKLYRNVML 

^TWNNLITVGYPFTXPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFDCMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKS^roHEKIHTGEKPYECNECGKAFS 

LHMI>6HTvi]£C?YK^^^ 

TGEKPf-.iCNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRIC/ FSHKKNFITHQKIHTREKPYECNEC 

GKAHQMSKL^iHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYIX^NECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFOVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTTlE/KPFKCh^ 

UtiHTGEKFYECSNCRKAFSHKEK^ 

QSYKCNECGKAFTKMSNLIRHQRIrrrGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

IJiMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKFreCNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAHQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

K5NLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDChJECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KKHTHQKIHTRENPI^VIIVEKASIRLWTSSDI 

3386 


A 


201 


1032 


WDDYPQGALRRREAAEGLHFLGPPGRVRGQLR 

GITGPAWYCHSPSHSII^AFCHIJPTPSRCPAMAR 

PPVPGSVWPNWHES/RRGQGVPGLHSAQEPPAG 

VWAA*AASAAAA\LSIDTASYKIFVSGKSGVGKT 

ALVAKIjVGLEVPWHHETTGIQTTVVFWPAKLQ 

assrvvmfrfefwix:gesalki<tohmixacme 
ntdafijlfsftdrasfedlpgqlariageapgv 
vrmvigskfdqymhtdvperdltafrqawelpl 
lrvksvpgrrlg 


3387 


A 


86 


96 


GSSPDPASLITMKNQDKKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEITLDvlQTLim^TPEEKLAAL 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HMERNSKLRQENMELAERLKKLIEQYELREEH1D 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKDFLLKEAVESQRMCE1MKQQETHLKQQLA 

LYTEKreEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 

KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW 

GSHRTSAVRIFS 


3388 


A 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK 
NKGKERRDLDDLKKEVAMTEHKMSVEEVCRKY 
NTDCVQGLTHSKAQEILARDGPNALTPPPTTPEW 
VKFCRQLFGGFSILLWIGAILCFLAYGIQAGTEDD 
PSGDl^YLGIVLAAVVIITGCFSYYQEAKSSKIME 

EIKC-GDRVPADLRIl^! HGC'K 7DNSSLTGBSEPQT 

RSPDCTHEVNPLKTRNIT. /FSNNFVEGTARGVWA 

TGDRTVMGRIATIASGUiV-KTPIAffilEHnQLIT 

GVAVFLGVSFFH^LlLGYTWLEAVIFLIGnVANV 

PEGIXATVTVCLTLTAKRMARKNCLVKNLEAVE 

TIX3STSTICSDKTG1LTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF'H^LLGFCNR 

PYFKGGQDNIPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFNSTNKYQLSIHETEDP 

NDNRYLLVMKGAPERILDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGKVIMVTGDHPrrAKAIAKGV 

GIIFEGNETVEDIAAROWVSQVNPRDAKACVIH 

GTDLKDFTSEQIDEILQNHTEIVFARTSPQQKLirV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMELLDDNFASIVTGVEEGRLI 

rT>NLKKSIAYTLTSNIPEITPFLLFIMAMPLPLGTI 

mCIDLGTDMWAISLAYEAAESDIMKRQPRNPR 

TDKLVNERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKWEFTCHTAFFVSIVYVQWADLIICKTR 

RNSWQQGMKNKILIFGLFEETALAAFLSYCPGM 

DVAIJlMYPLKPSWWFCAFPYSFLIFVn(T)EIRKLI 

LRRNPGGWVEKBTYY 


3389 


A 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFG 

MySAEELJCKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYL1XRGSCLNCHMLTCPRAVIHLL 

LCQLRVL^VGALQAVYELERILNRFLEENPDPSA 

SEIREEIJEQYTreiVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSVYRKEHNS 

KLTITFPAMVHRTAGQKDSEPLGIEEAQIGKRGY 

LTPTSAREHI^ALWKNEGFFLNYIJ^MDDDGM 

ESRFNPSWFXDFLVWPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSLIAEDRSFLSTLPGQSLIDKLYNI 

WIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAAASVICPDMYINTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDIUXNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVIACTDQQYLVP 

KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQW 

ST1XINIIPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDILVKPKADVKRQRIIEESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMIDLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMA 3IIEGLY^TAVK7 : ^oYlQREIIK * 

HLEGLWQYDLTViv >X3SWQFLYGEDGLDIP i 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKAIifflFRAIKKWQSKHPOTlJLRRGAFLSYSQ 

KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 

YEIJ^EESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQLVKWQRSLCEPGEAVGLLAAQSIGK^T 

QMTLNTFHFAGRGEMNVTLGIPRLREILMVASA 

MKTPMMSVPVUOTKJCAL^ 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYY(^EKCLRPEDIIJIFMETRITK1XMESIK^ 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

QPGAPGAVEAMERRVQAVREIHPFIDDYQYDTEE 

SLWCQVTVKLPLMKINFDMSSLVVSLAHGAVIY 

ATKGITRCIJLNETTNNKNEKJBLVLNTEGI^ 

KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 

EDCDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLWGKWRGGTGLFELKQPLR 


3390 


A 


2 


2080 


ELPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEEDAYWLELINSELKEMERPELDELTLERVLE 

ELETLCHQNMARAJETQEGLGEEYDEDWCDVC 

RSPEGEDGNEMVFCDKCNVCVHQACYGILKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWEVSIGCTEKMEPITKISHIPASR 

WAl^CSlX:KECTGTCI(^SMPSaVTAFHVTCAF 

DHGLEMRTILADNDEVKF1CSFCQEHSDGGPRNE 

PTCEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPIXTPKTDEVDNIAQQEQDVLYWaKLFTHL 

RQDLERVRNLCYMVTRRERTKHAICKLQEQIFH 

LQMKLIEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVP\GPAASPKPLG 

RLRPPPREPR*T\RRLPGCVARPDAGDGDHLSAVA 

ERPKV\SLHFDTETDG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 


3391 


A 


1555 


327 


nsflhflhlkvrtmflfpsfpvlllsvvtascskt 

kacadtqktcsmitcgipvtngtpgrdgrdrpk 

gekgepglgqvsvas*istsgrcssksvlepatrg 

lkhrlgeaplssgpmlhseqpl*naiasktklfv 

dslgshistqelgvcgcpfrgvsclvgelalvqa 

lh*vagesfffgsdhwligcaggeqewsiei:lgk 

kkrvtatgssslclatgqglrglqgppgkmgpp 

gntgtsgipgprgqkgdrgdnsvaeaklanler 

kl*slrseldhtkkl*pfslgk\msgkklfvtnge 

rmpfskvkalcaglqatvaapknaeenkaiqdv 

akdtaflg.tbeategqfnryxtggrltysnwk^ 

7^EI*NDHGSCZl>CViLlJINGLv, , I _~ CTSSF1AICZ 

?.YA- * 


3392 


A 


218 


1773 


GGSRRNQRRSIPVLGYF1JCQKKMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYKDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPEffiKADDHLQQPLQNQKTLKRTGQRYEHGR 

TLKSYIXjLTNQSRRYNRKEPAEWGIXjAFLHDN 

HEQMPTEmFPESRKPISTKSQFLKHQQTHNlEKA 

HECTDCGKAFLKKSQLTEHKRIHTGKKPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSECGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLWHQR 

THTGEKPHTCSECGKGHQKGNLNMQRTHTGEK 

PYGCIDCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLIRHQKfflSGEKPYKCSDCGKAFL 

TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL 

VKHKRIHSREKRGD/CSEGGKSFHSKSQLKS**TC 

AGEKPC*YGNCGNGGRAV 


3393 


A 


46 


1464 


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL 

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEPNHVIFKKISRDKSVTVrYLGNRDY\IDHV\SQV 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESIXKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATOSTDAEEDKIPKKSSVRL 

LIRKVQHAPLEMGPQPRAEAAWQFFMRDKPLH 

LAVSLNKRDLn*MGSPIPWVSVPVNNTEKPVKKI 

KA\SVEQVANVVLYS\SDY\YVKPVAMEEAQEKV 

PPWSTWTKA\LTLL\PWLVNKRERRGIALDGKJ^ 

EDTNIJVSSTIIKEGIDRKRS WEIL VSYPDQR* SSTV 

SGFlX}RASPSQ*SRPT*RSQFia\MHPQPVEDPA^ 

ESYQDANLVF\EEFARP*JLKDAGEA*\EGKRDQE 


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQmSSSCRLFFPEDP 

VKIYRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLWQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQIXIVFYFI^SGSEANDLALRLARHYTGH 

QDVVVIJDHAYHGHLSSLIDISPYKFRNLDGQKE 

WYHVAPLPDTYRGPYREDHP\TrTVnEIX}LEKAFS* 

KRWQGRNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEHIRKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQIXJDHATSVGSFLMQLLGQQKTKHPIVG 

DVRGVGLFIGVDLIKDEATRTPATEEAAYLVSRL 

KENYVLLSTDGPGRNILKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395 



A 


1 


1424 


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVTWG 

PITERKXRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAF1RWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVITSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDWCDNCTKT? \ 

QRLSWSSHGTPLKRHEHVQFNEFi l^IYKYKi, 

LGHKPSQHNPKU^IKhTO^ 

NQPGAPKTQIFMNGACSPSLIJ>TLSAPMPFPLPV 

VPDYSSSTiXFRLMGSCRPPWETWHSGTLCSFTD 

GPHL 


3396 


A 


109 


107 


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL 

PHVRAFAYTWFNLQARKRKYFKKHEKRMSKEE 

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYREDFVLTVTGKKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVILFKGIPLESTDGERLV 

KSPQCSOTGLCVQPHmGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS 

FVT\SGVRSVT*A*LRVSQTPI\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED 

EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG 

MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR 

PVITGTQSKFHIATPSIL\HFPRHSPFFQQPGPYFSH 

PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS 

QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 

PPTTSTEGGAASPTSPTTRS/PGRTRPQQPFL/SYG 

PP*PSNALIGGGGGGAGERAGERADLEM 


3397 


A 


1 


2002 


TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRRLP 
VGPIJLRAIATCHALSRLQDTPVGDPMDLKMVES 

TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLIIRFPFSSALQRMSVVVAWPGA 

TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

YTAAGYRWALASKPLPSVPSLEAAQQLTRDTV 

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRJRA 

VMVTGDNLQTAVTVARGCGMVAPQEHLHVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTFGHVKHFPKL^ 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASVVSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTNLGDIXJFLAIDLVITTWAVLMSRT 

GPALVLGRVRPPGAIJ^WVLSSUJXJMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVWSLSSFQYLILAAAVSKGAPFR\RPLTNNVPF 

LI^SAL*SSVLVVLVLSPGLLHGPLALRNITDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAGVSKKRFKQLERELAEQPWPPLPAGPLR 


3398 


A 


758 


1368 


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 

ICRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 

RPGQGE/PGLISPKPVTEVLPDVQGAPVPVPPLPT 

PPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA* 

PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 

TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 

SFL 


3399 


A 


906 


1091 


HHHHHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS 
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400 


A 


1838 


325 


PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GIJPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

FV5C4PCS\ / KDQTPLci.: V5ITr:rNTCrC. : . . "TT 

PETSPPi 1 ; ; ??PSSTPCSAHLTPSSLFPSS : -SSJton^" 

KFYNFVILHARADEHIALRVSGRSWEALr ,TDG 

ATFCEDFQVPGRGELSCLQDAIDHSAFnLLLl TN 

\FIX^\LSLHQVNQAMMSNLT\R(^SQDCVIr^bLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

NTr^HRLQARKAMWRKBQDTRAIJmQSQHLD 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAFYGARMPFGGQVPLGAPPPFP 

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQPVLIIHHAQMVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE 


3401 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 

KKEISPLnGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DWFEESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETTVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

WAGMGNSGITTELTLKYnTNVTTI^GISSVNA 

GQDVNIIITYKTSL*NTNLGDVAKGLQSSNFG 

Vj l Y 1 roL IrQ I K 1 GvANJUL ITrVJi'M WQiv 1 1 FRME 

NLQLH/CPEDASTKKANVILPVESSKSFQEFYSTS 

CI^PCENNWNLKKGVFSfl^ 

PKLLFRLTVIILTFKCYYVLFHLJINARVLDV 


3402 


A 


153 


1389 


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISEnVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

VNAGMGNSGITTELTLKYIITNVTTI^TGISSVNA 

GQDVNIHTYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYrTlME 

NIXJLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

OJSPCENNWNLKKGVFNKSRCnCSKLAEVWIFI 

PKIXFRLTVI1LTFXCYYVLFHLHNARV1J3V 


3403 


A 


609 


2765 


SRHCTPAERQNBTHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQrIIKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDHAADIX3RYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNMSDKHGFITLNSMHKYQPRFrnVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDF\SPSRG*RATPEAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 

P\FNLNTMRPRLRYSPYSIPVPVPTX5SSLLTTALPS 

XJ,vAJ^PnJS}VJ%^ ^; v..SPASYo\ v - * 

R3S\TLSSSSMSLSi >X(^^EAAT$ELQSIQRLVS 

GLEAKPDRSRSASP 


3404 


A 


1082 


1308 


LKKFLEVPQSYSLLI^Si'jFLQ\WRA*RPQNAIG*Q 
FHKTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC 
SHSSP 


3405 


A 


1553 


559 


PRPrTQRI^RFAPrcRTAEFPFRRRAVVTRPAPPR 

ACTWGRSSPVTGLAVGAAVAMLTVAARSRPFA 

PVLSATSRGVAGALTVP*MQATVPATPEQPVLDL 

KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 

HTDIKWDFSEYRRLEVLDSTKSSRESSEARKGFS 

YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 

LALAKIEIKLSDIPEGfcNMAFKWRGKPLFVRHRT 

QKEIEQEAAVE1^QLRDPQHDIJ)RVKKPEWVILI 

GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 

RIRLGPAPLNLEVPTYEFTSDDMVTVG 


3406 


A 


83 


2671 


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVAVNFTQEEWAIXDISQKNLYREVML 

ETr^NLTSIGKKWKDQNIEYEYQNPRRNFRSVT 

EEKVNEIKEDSHCGETFIPVPDDRLNFQKKKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KKPYACKECGKNIIYHSSIQRHMVVHSGDGPYK 

CKFCGKAPTTWl^LYLIPIERTrrrGEKPYECKQCG 

KSKYSATHRIHERTfflGEKPYECQECGKAFHSPR 

SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERraSRKKLYECKQCGKAI^SLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTHIRIHSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYEC/TATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSIL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTHNGEKPY 

ACKKCGKPFGSAQNLRIHERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENFYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGKI^^ 

NVAKLSLLPVLFNIN1KEFTLGRNPISVSNVRKPLF 

LPlXFNIMKGLTWERNPMSVCrrVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 


3407 


A 


1426 


3 


PAAPSGASPGRVCGVETARPLGVQRRQSADEGP 

PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 

PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 

GSIXPVKIffiTDFEKAHRSKKILSIXNTFGGGVFL 

ATC\LTALLARC*GKSSRRSWSLGHISTDYPL\AE 

TILLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 

RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 

ERAGPLA\PSPVRLLSLAFALSAHSVFEGLALGLQ 

EEGEKWSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 

SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS 

DRLLKVIJF\LWGYTVlJVGMGLPQVVSGLAIVPA 

AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 

KGI -iSSTTL CPRS YTLST IUI.1LFK^L5LKSL . 

KK 


3408 


A 


106 


451* 


EARDRLAQSRAKEKELNSVASELSARQEESEHSH 

KHUEUIREFKKNVPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQFP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

LI^LRRKYDEEAASKADEVGLIMTNLEKANQRA 

EAAQREVESLREQLASVNSSIRLACCSPQGPSGD 

KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEEIKTELSILKAMKLASSTCSLPQGMAK 

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSIKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF - 

KGEAGGLLVFPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLKHN1GQRVFGHYVLGLSQGSVSEILARPKP\ 

WRKLHG* *GKEPFIKMKQFLSDEQNVLALRTIQV 

RQRGSITPRIRTPETGSDDAIKSELEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 

QSIIRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 

VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYWRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKI^JCNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHNVEKLRDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRWL 

APEEKEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NTVINWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDFIPQSPDSFTEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRIKQEQMEEDAEEE 

AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK 

VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 

VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 

SAKVNPNIX5RRHEKMA>nLNNirm,ERAANREE 

ALEWEF 


3409 


A 


162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AmXjEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQH|IR 

GHTAVRPHECDECGKIJSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFS};! TJSL!TIH^ PJHSC >_ > Z TTEOjyJSI 

f - NSSIJEHiiRVHTGERPYKC- SCGkciFRQRSAL 

LQHRGVPTGERPYECSECGKFFT i'SSSLGKHQRV 

HTGSRPYECSECGKSFKJNSGIJKilR^VHTGEKP 

YECIE*KKSFSHNSSIJKHQRIHSR*KFYE\CKCG 

K\R*HPGESP*\^SECQ/KSFS*RPYLIECHTVHKG 

KTLLICRDVQLI 


3410 


A 


167 


789 


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS 

DKLPDLMPPA\EPLGSALELRASLEE)VAE\RGCE 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 

Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 

V/GEQmCEAVRKGSGSSSCSQRGP\PPPGMEVCPL 

LGFWAICP 


3411 


A 


1040 


887 


ASLSKPAGISTMPWALILLFLLIHSAVSVyQAGL 
TQPPSVSKDLRVQTATLTCTGNSNNVGHQGVIWL 
QQHQGHPPK1XSYRNNNRPSGISERLSAYKSGNA 
ASLTIYGLQTEHEAD* *CRPRRKLff KTARLFFFFL 
IDNEEYLLRVY 


3412 


A 


164 


83 


RRGIPGSASLSLTMCVRSCFQSPRLQWVWRTAFL 

KHTQRRHQGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFS1XTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYU>NQK5rXPLDRKQFDVIVESCMEGICKP 

DPRIYKlX^LEQLGLQPSESIFlJDDLGTNIiCEAAR^ 

GnmKVNDPETAVKELEAIXGFTUlVGVIOTr^ 

\TCKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQSNPTYYIRLANRDLVLRKKPPGT^ 

EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSIJrXjLEPSHRRAIYTAMNTV 

LCKIHSVDLQAVGLEDYGKQGSTTWV/YSSRRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGIPAAEEYFRMY CLQMGL 

PPTENWNFYMAFSFFRVAABLQGVYKRSLTGQA 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 

ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 

QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 

W*GGRSGRTSWRLLALGCHT 


3413 


A 


105 


1573 


PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTIMfflQDPASQRLTWNKSPKSVLVDCKMRDAS 

LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFIICLGGDGT 

LLYASSLF(^SVPPVMAFHLGSLGFLTPFSFENFQ 

SQVTQVIEGNAAVV17RGSRIXVRVVKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEWEDRGP SS YLSNVD VYLDGHLITTVQGD/G * 

GPQHLSWGP*AFLGRE*RLRLSLSGVTVSTPTGST 

AYAAAAGASMIHPNVPAIMITPICPHSLSFRPIVV 

P AG VELKIMLSPEARNTA WV SFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKQAHFEEEEEEEEEG 


3414


A




2602


ALGLPDLTiCPFTF * 1 JiHEKMA 7GVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLNIKAPHAWTLMNTKGHHWLT 

NARLTKYQSIJH^ENPHITIEVCfr^^ 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMIXjSSFINSQGERCAGYAVVTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LRIGSHNGPWVADIJX^EINVDTGVIWATW^ 

EKDPVQ1XJKGKSGPSCTKGQCNPLELVTTNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPWQTFVTDELNVPITEFPGKTRhnLFLQIJVEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

WDEFPAQKNHPDNFWVLKASHRQYYIARVEKD 

FILPVGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRD WTAPTGL YWICGHRA YTKLP\A SSCVIGTI 

KPSFFLLSDCTGELLGFPVYASR\KSIAIRN*NNDK 

WPPERnQYYGPAT*AQDGSWGYRIPIYMINRIIRL 

QA VIXIITATGRALT1LAQQETQMRNAIY QNRLA 

LDYLIjVAEGEVCRKFNLTNCCLHIDNQGQVVED 

IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

GGFTCTLimvnVIGTYLLLPRLIJVlXQMIKSFIAT 

LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNFITIXJKREDDSG/SAA*DFKWP 
EPGKPBFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


FFFFQRJU^mEHSGSVSLIJUACDLGWCEDWSCC 
LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 
CKESQQEAQENLREDLCLESFAKDKILQIIEGSER
EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 
QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 
GKIJfUI>QPVRKI,GGPTPS/TELro 
A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 
RERTRKACPPKPRPPLG1JK5DPTGPVTHHAPPVS 
PTGASGQERRAEPGAVSYAHASATK 


3417 


A 


243 


847 


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGfflSVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTrTTHTHTHTPTOYGAHHTDPLQRWGLGPRVKS 

EAGPLPQLSRDQSHPGPLSPGA SPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSENPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTrXJYLOTroEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFITSWRNGLSFCAI 

LWTT/ PDLIDYKSLNPQDIKENNKKAYDGFASIGI 

ELK WQIEENSSKS'i V iCVGr^ V3TDTNSSVDQEJCF 

YAELSDIJKREPELC^Pi: GAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSi^S "ASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDTOICSHTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKTVQHRLLSRQEELKERARVL 

IJEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQUAEARSGVKMSELPSYGE 

M^EKUCERSKASGDENDNTBIDTNEEIPEGFVV 

GGGDELTNLENDUDTPEQNSKLVDUCLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDT13CGNEEKAAITETQRKPS 

EDEVLNKGFKDSVSQYWGELAALENEQKQEDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQI^LLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKJTELINKLNFLDEAEKDLATVNSNPFDDPDA 
AELWFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTrHJYLNPFDEPEAFVTIKDSPPQSTKR^ 

RPVDMSK YLYAD SSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVUjRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRIJJEPSDMVIXAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

K1XJTLDIGSN1JEKEKIJENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELIOERARVL 

LEQARRDAAlJCAGhnCHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTOEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDIJaKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTOKGNEEKAAITETQRKPS 

EDEVLNKGFKDSVSQYWGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLIJEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLBRTLEQNKG 

KMAKKEEKCVLQ 


3420 


A 


612 


1058 


ENLGPNYSHRLLHHPTFYKKIHKKHHEWTAPIG 
VISLYAHPIEHAVSNMLPVWGP^VMGSHLSSITM 

VTSl^irrr^'ICGY^TLPFLPSF* '7 jjY^HT t^n/ 
QCYGVLGVLDHLIvX, ;*!; IMFKQTKA YERHVLLL i 
GFTPLSESIPDSPK 


3421 


A 


23 


2005 


LLTPCDGRIPGRPSVGAESGSDFQQRRRRRRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

IPVEATEAKKRKVLEFERVYLDNI^SASNfyERS 

YMHRDVnTTVVCTKTDrllTASHIXjHVKFWKKffi 

EGIEFVKHFRSHLG VIESIA VS SEGALFCS VGDDK 

AMKVFDVVNFDMINMIJCLGYI^GQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAWSSDKSGMffiYWTGPPHE 

YKrTKNVNWEYKTDTOLYEFAKOKAYPTSVC^ 

PDGKKIATIGSDRKVTUFRFVTGKLMRVFDESLS 

MFTCLQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIVFDETGHFVLYGTMLGIKVINVETNRCV 

RIIXjKQENIRVMQLALFQGIAKKHRAATTIEMKA 

SENPVLQNIQADPTTVCTSFKK^niFYMFTKREPE 

DTKSADSDRDVFNEKPSKEEVMAATQAEGPKRV 

SDSAIIHTSMGDIHTKlJ^VECPKTVa^C^S^ 

GYYNGHTrliRIIKGFMIQTGDPTGTGMGGESIWG 

GEFEDEFHSTLRHDRPYTLSMANAGSNTNGSQFF 

nVVTTPWLDNKjFnVFGRVTKGMEWQRJSMVK 

VNPKTDKPYEDVSIINITVK 


3422 


A 


2486 


433 


FVLVCAPLTWAGARHRRMAASKKPPRVRVNHQ 
DFQLRNLRIIEPNEVTHSGDTGVETDGRMPPKVT 

SELLRQLRQAMRNSEYVTEPIQAYIIPSGDAHQSE 

YIAPCDCRRAFVSGFIXjSAGTAIITEEHAAMWTD 

GRYFLQAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLIIPTDYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADIJU.KMAEimVMWFVVTALDEI 

A WLFNLRG SD VEHNP VFFS YAIIGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPICIAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGIJ3YLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMIVTDEPGYYEDGAFGIRIENVVLVV 

PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGNEDFNSV1QQMAQGRQIE 

YIDffiRPSTGGLGFSVVALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQN1SHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRUVPGGLADRDGRLQTGDEQLKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR /.- 

i^K QSLC^ir sr/GTShrGE^c-'VKsiiPos/: 

AYHNGHIQ -^i/ki v AVDGVNIQGFANHDWEVL 

RNAGQWffi. (LVRRKTSSSTSPLEPPSDRGTWE 

PLKPPALFLTGA V :TIKTNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSIENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKTVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAWPGLVHLGICKPLVEDN 

EEESCnLHSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEEEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

KTSLDLGMIPNDVQGPSLLIDLPVVAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISrV 

GGQTVIKRIJKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

VVFIVQSI^SIPRVIPNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 

DAFITXJKIRQRYADLPGELHIIELEKDKNGLGLS 

LAGNKDRSRMSIFWGINPEGPAAADGRMHIGD 

EIXEINNQILYGRSHQNVASAIIKTAPSKVKLVFIR 

NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

XDGSI^VGIKQIJESESFKLAVSQMKQQKYPTKV 

SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD 

PATCPIVPGQEMIIEISKRRSGLGLSrVGGKDTPLV 

NG VDLRNS SHEEAITALRQTPQKVRL VVYRDEA 

HYRDEENLEIFPVDLQKKAGRGLGLSIVGKR 


3424 


A 


2223 


1162 


HASERWQLPDFVWDQYTHSIXjRVEREFKXRKR 

HTRR VKL VFDKGLPARPKSPLDPKKDGESLS YS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKXSSTSRRQHPLNKHLFKP\GTFNfTSHEPPVY 

MDEDDDRSCTHSHMNTAVEDASDDESIPIMYRN 

IJ>EYKELLQFKXLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPPVEMSL\DFCVDS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3425 


A 


2223 


1162 


HASEJFLWQLPDFVWDQYTHSLGRVEREFKNRKR

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL

WTVEEQKKLEQIXIKYPPEEVESRRWQK1ADELG

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI

YSKKSSTSRRQHPLNKHLFKP\GTFMTSHEPPVY

MDEDDDRSCFHSHMNTAVEDASDDESIP1MYRN

IJEYKELLQFKKLKKQKLQHMQAESGFVQHVGF

KCDNCGEBPIQG\VRW\HCR\DCPP\EMSL\DFADS

C\SDCLTOT\DIHKGDHQLEPIYRS\ETFLDRDYCV

SQGTSY «DP: : Y : ?ANT.


3426


A


2 


1553 


lu^VVVRDDPRWGTPRYW: HALx^QQSSFTAPP 

GIJLPLEYFPAAPHCSHSRQW<CSQTHRIHHHPQ 

MLGPCRQEICGITMAAGTL YIYP; NWRAFKALI 

AAQYSGAQVRVI^APPHFHFGQTNRiPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQWQWVSFADSDIVPPASTWVFPTLGIM 

HHNKQATENAKEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\VFGEVKLCEKMAQF\DAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVLDEFKRIC^SNEDTLSVALPYFWEHFDKDGW 

SLWYSEYRFPEELTQTFMSCNLITGMFQRLDKLR 

KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP 

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 

GAFQHVGKAFNQGKIFK 


3427 


A 


755 


52 


TAARRRQKGTAARRRQKGTAARRRQKGTAARR 

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

AARRRQKGTAARRRQKGTAARRRQKGTAARRR 

QKGLSNU)AAEWLPPKKG\GEKKKGPFLAINEV 

VT^YPIMLKRIHGVGFKKRAPRAIJG^ 

KEMGTPDVRIDTRLNKAVWAKGIRNVPYRIRVR 

LSRKRNEDEDSPNKLYTLVTYVPVTTFKNLQTV 

NVDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSG1TIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

FG1X}ELSAHATPAITAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

TNVWVALYKNNVPATYTYDEYKKGYLDQASG 

GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 

FSGFLLCPT 


3429 


A 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG 

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQ\AQLGQLS 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVLYQWAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A


799 


1989 


INK YINIRKKDCLLSPLPPL WSHL ALLO AS ATKWV 

cisA/jFAGMmy^jOVLr" wrsia a,: ^/u. 

* fTWLLATKRRKQQLVLi . ^DiHlCEEL^DPPLPTT 
rTSV>TV r HFTRQCNYKCGFCFHTAKTSFVLPLEEA 
KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 
LA^RFCKVELRLPSVSI\VSNGSLIRERWFQNYG\E 
YLDILAISCDSFDEEVNCMGRGMGKKNHVENL 
QKL\RRWCRDYRVrTKmSVINPHNVEEDMTEQI 
KALNPVRWKVFQCLLIEGENCGEDAVLREAERFV 
IGDEEFERFLERHKEVSCLVPESNQKMKDSYUL 
DEYMRFLNCRKGRKDPSKSELDVGVEEAIKFSGF 
DEKMFLKRGGKYIWSKADLKLDW 


3431 


A 


5468 


2146 


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWITDKELE^ASFRSWKRIPVVVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

GGGCECEEYYPNCEVVFMGMANIHAIRNSFQYL 

RAVCSQMPDPSNWLSALESTKWLQHLSVMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEI^AFLVKLVQHTYSCLYGTrXANNPOEREK 

RNIYK/RGTCSVWALLRAGNKNFHNFLYTPSSD 

MVLHPVCHVRAIJHLWAVYLPASSPCTLXjEEN 

MDLYLSPVAQSQEFSGRSLDRLPKTE^SMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSGVGGPQQTVGEVGLPPPLPSSQ 

KDYLSNKPFKSHKSCSPSYK1XNTAVPREMKSNT 

SDPEDCVLEETKGPAPDPSAQDELGRnJDGIGEPP 

mCTETEAVSALSKVISNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDF\LNQDPSGSVAS1SHQEQLSSVP 

DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH 

RIJIQIBAGYKQEVEQLRRQVRELQMRLDIRHCC 

APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 

IXXSEASWEPVDKKETEVTRWVPDHMASHCYN 

CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVL VCNSCYEHIQ V SRARELMSQQLKK 

PIATASS 


3432 


A 


36 


1873 


MTFFSSVADFIGLDPRIAAWLIDPSDATPSFEDLV 

EKYCEKSITVKVNSTYGNSSRNIVNQNVRENLKT 

LYRLTMDLCSKLKDYGLWQLFTITLELPLIPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 

TGLQKYPSTVSEALNALRDLHPLPKJILEYRQVH 

KIKSTFWGLLACMKKGSJSSTWNQTGTVTGRLS 

AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFIAADFSQffiLRILTHLSGDPELLKL 

FQESERDD VFSTLTSQ WKD VPVEQVTHA DREQT 

KKVVYA' TY l ^-IY "HLA A C: ™* T QEA*'SSL' ;? 

- 7 ^QKYKisJKDFARAAIAQc>:,QVw^ VSIMGRRK 

PLPRJHAHDQQLRAQAERQa 1 .^fwqgsaadlc 

KLAMfflVFTAVAASHTLTARL VAOJHDELLFEVE 

DPQIPECAALVRRTMESLEQVPLKVSLSAGRSWG 

HLVPLQEAW\ALRQAHVALSLPATAWLPLGPLP 

APSPHPCIFRLHFVCSPRQQWEERTGFQQSrVWPS 

PRSPALYAPGRINPLGLGWPAIPWSKCLCKALKK 

K 


3433 


A 


1481 


476 


IPPKERAPGIRASCLAITAGARPTSYGRVGCEGDV 

RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA 

ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 

KESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 

LSTLQVO^WFINARRRLLPDMLRKIX5KDPNQFTr 

SRRGAKISETSSVESVMGIKNFMPALEETPFHSFTA 

AGPNPTLGVRPLSAKP/SQSPGSVLARPSVICHTTV 

TAIERLSLSLSCQSVGCGQNT\DIQQIA'ARNLRDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 


3434 


A 


1720 


1243 


NGPVPPGGSKTKWAGGSAAEGSPRLSPSPGAAQ 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 

PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 

IGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 

LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 

RTVQLNVCSSEEVEKV/V GDCPLEPEGPVEKGMW 

GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 

A^RMTKSFIJIJRRAKGRVVMSSMIXSRMANPAR 

SPYCITKFGVEAFSDCLRYEMYPLGVKVSVVEPG 

hMAATSLYSPESIQAIAKKMWEELPEVVRKDYG 

KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT 

ATTPYTRYHPMDYYWWLRMQIMTHLPGAISDM 

IYIR 


3435 


A 


842 


3595


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 
LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 
GMLSAEIERFSAMFQEKKQEVQTAILRVDQLSQQ 
LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 
LQIRNQLNQBQNSKLQQQKELLNKRNMEVANfM 
DKWSELRERLYGKKIQACEKVFLNRVNGTSSPQ 
SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 
LSIASNAAHGK^KSANIXjNWPTLKQNSSSSVKP 
VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 
PIEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSFrPL 
GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 
GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 
SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 
SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 
KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 
KEPEQIXjPAAPAIXjSTVESLPRPLSPTKLTPIVHS 
PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 
GPGGPN1QKLLYQRFNTLAGGMEGTPFYQPSPSQ 
DFMVTLADVDNGNTNANGNLEELPPAQPTAPLP 
AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 
EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 
PLPPASHPPATSTNKRTNLKKPNSERTGHGLRVR 
FNPI , ALLLDASLEGEFDL VQRIIYE VEDPSKPNDE 
GTTPLHNA : /CA jHHHIVaG'LLE r J V NT ~ * A ADSD ! 
X3WTPLHCAASCNSVHLCKQLVESGA . aSTISD 
IETAADKCEEMEEGY1QCSQFLYGVQEKLGVMN 
KGVAYALWDYEAQNSDELSFHEGDALITLRRKD 
E 


3436 


A 


3 


2604 


GSTHASEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SOMrT^MIX^VSVGIRlWDLKGKPVAVTS^ 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQIXTNLQAVPYDFHAYKEVA 

QTLYETIAS\YTHMEAVSC3)EALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TUXJATDNAICnGKAMLNMFHTMKLNISDMRGV 

GIHVNQLVPTNLNPSTCPSRPSVQSSHFPSGSYSV 

RDWQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCITLPPFPAHLPTSPDTbTKAESSGKWNGLHTPV 

SVQSRLNLSIEVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 

PKNPIXHIiCAAVKEKKRNKKKKTO 
NNKIXNSPAKTLJH3ACGS 

EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 
AGAX^FNDVKTIXREWITTISDPMEEDILQVVKY 
CTDLIEEKDIJEKLDLVIKYNlKRIiMQQSVESVWN 
MAFDFELDNVQWLQQTY GSTLKVT 


3437 


A 


32 


4038 


SLLRLLKAQWGSSGAASEPWLGEEGCGFPSTNE 
YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 
IXJEEACASRQIJISIASFSTVTARRNPLHNPWGM 
EIJ^SENTDSPSPRPLRPGVTIJ>PGALTM^^ 
TEVAENSHHLKIFLPKKLXECLPRCPLLPPERLRW 
NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL 
YNRKKVKYRKDGYLWKKRKIX3KTTREDHMKL 
B^QGMECLYGCYVHSSIVFITHRRCYWLLQNPD 
IVLVHYLNVPAlJEDCGKGCSPirCSISSDRREWLK 
WSREELIXjQLKPMFHGIKWSCGNGTEEFSVEHL 
VQQILDTHPTKPAPRTHACLCSGGLGSGSLTHKC 
SSTKHRII SPKVEPRALTLTSBPHPHPPEPPPLIAPLP 
PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 
SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR 
PSMSLAVWGTEPSAPPAPPSPAFDPDRFLNSPQR 
GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 
EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 
KGHGAAPPEPSPPPSPPPSPAPLEPSSRVGRGEALF 
GGPVGASELEPFSLSSFPDLMGELISDEAPSEPAPT 
PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 
AEHYSC VFDHIAVPASLVQPG VLRCY CPAHEVG 
LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 
DWLSLDDNQFRMSILERLEQMEKRMAEIAAAGQ 
VPCQGPDAPPVQDHGQGPGFEARVWLVESMIP 
P.ST7.T/ 7 7"^ • ;AHG3rF?vCM^ILI?LAAAQGY/_RL 
IETLuVv»^ V i\TGSLDLEQEVDPLNVDHFSCTPL 
MWACA V-GHLEAAVLLFRWNRQALSIPDSLGRLP 
LSVAHSRGl^VRLARCLEELQRQEPSVEPPFALSP 
PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 
PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 
DYEATOSKGPLSSLPALPPASDDGAAPEDADSPQ 
AVDVIPVDMISLAKQ1IEATPERIKREDFVGLPEA 
GASMRERTGAVGLSETMSWLASYLVENVDHFPS 
STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 
IGKLIFALLTL\SD\QEQRELYEAARVIQTAFRKYK
GRRLKEQQEVAAAVIQRCYRKYKQLTWIALKFA 
LYKKMTQAAILIQSKFRSYYEQKRFQQSRRAAV 
LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 
KQDQAARK1MRFLRRCRHRMRELKQNQELEGLP 
QPGLAT 


3438 


A 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATFTQSLK 

VVTKRGSAlXjCTDWSmiKKYQVLVGEPVRIKC 

ALFY GYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

EP1AFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDIEDFLLPTREPEILWYKECR'nCT 

WRPSIVFKRDTLLIREVREDDIGNYTCELKYGGF 

VVRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKFTE 






WO 01/57190 



PCT/US01/04098




DLDENRVWESDIXKJLKEHLGEQEVSISLIVDSVEE 

GDIX3NYSCYVENGNGRRHASVIXHKRBLMYTV 

ELAGGLGAILLLLVCLVTIYKCYKIEIMLFYRNHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFAI^EPDMLEKHYGYKLFIPDRDLIPTGTYI 

EDVARCVDQSKIU.IIVMTPNYVVRRGWSIFELET 

RLRNMLVTOEIKVILIECSELRGIMNYQEVEALK 

HTIKLLTVIKWHGPKCNKLNSKFWKRLQYEMPF 

KRIEPITHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTFHNTYHSQMRQKHYYRSYE 

YDWPTGTLPLTSIGNQHTYCMPMTLINGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSILIGGGHLJ^IRSCXNLLLI^SKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFIERGSSLSVMIWPIjU^IIJ(HGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDVVTETEGKQ 

NRAVFDAVMVCTGHFLNPHLPLEAITCIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 

SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV 

TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY 

GLSITKGKKAKFIVNDELPNCILCGArrMKTSVIE 

FTETSAVFEDGTVEEhnDVVIFTTGYTFSFPFFEEP 

IJCSLCTKKIFLYKQVFPLNLERATLAIIGLIGLKGS 

ILSGTELQARWVTRVFKGLCKRPASQKLMMEAT 

EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT 

KPSEPLLFLKDPRLAWEVFFGPCTPYQYRVLMGPG 

KWDGARNACLTQWDRTLKPLKTRIVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 


3440


• ** 

• 

■ 




3533 


iMPccr.SRLLRc .;iirm:-vsDi^.:^".^3SVKi 

ENS:-, 'LGESMAGISQNAKTGDLP/i5GEcvCIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAC ^[QGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLf;AATIV 

AKHTSALCNACRIASSKTANPVAKRHFVQS/iXE 

VANSTANLVKTIKALIXjDFSEDKRNKCRIATAPL 

IEAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV 

SAKPMLESSSYLIRTARSIJUNPKDPPTWSVLAG 

HSHTVSDSKSLITSIRDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSW 

QEIGHLIDPUTAARGEAAQLGHKGTQLASYFEP 

LIIAAVGVASKIIJDHQQQMTV^^ 

QMLYAAKEGGGWKAQHTHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTWKYSKAIAVTAQEM 

MTKS\TIWEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGUL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSGIIADIJDTTIMFATAGTLN 

AENSETFADHRENILKTAKALVEDTKLLVSGAAS 

TPDKLAQAAQSSAATTTQLAEVVKLGAASLGSD 

DPETQWLINAIKDVAKALSDLISATKGAASKPV 

DDPSMYQLKGAAKVMVIWrreiJLKTVKAVEDE 

ATRGTRALEATBBCIKQELTVFQSKDVPEKTSSPE 

ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN 

LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTEXHXGYLDLLEHVLVILQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE 

TELLGAAASIEAAAKKLEQLKPRAKPKQADETL 

DFEEQILEAAKSIAAATSALVKSASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS 

LCEAANASVQGHASEEKLISSAKQVAASTAQLL 

VACKVKADQDSEAMRRLQAAGNAVKRASDNL 

VRAAQKAAEGKADDDDVVVKTKFVGGIAQIIAA 

QEEMLKKEREIJiEARKKljVQIRQQQ^ 

REDEG 


3441 


A 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSALHS 

PAHRPPGFSVAQKPFGATYWSSIimLQTQVEV 

KKRRHRLIOIHNDCFVGSEAVDVIFSHLIQNKYF 

GDVDIPRAKVVRVCQALMDYKVFEAVPTKVFG 

KDKKIHTEDSSCSLYRFITIPWQDSQLGKENKLY 

SPARYADALFKSSDIRSASLEDLWENLSLKPANS 

PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 

DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 

KAYSDSQEDEWLSAAIDCLEYLPDQMWEISRSF 

PEQPDRTDLVKELLFDAIGRYYSSREPLLNHLSD 

VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE 

EFRRLLYFMAVAANPSEFKLQKESDNRMVVKRI 

FSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKI 

PGTL\HKIVS\VK\LMAIQNGRDPNRDAGYIYCQRI 

IXJRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 

KKK\LLGQFYKCHPDIFIEHFGD 


3442 


A 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAAQQ 
VAEDKFVFDLPDYESINHVVVFMLGTIPFPEGMG 
GSVYFSYPDSNGMPVWQLLGFVTNGKPSAIFKIS 
' T JCS^7r Q- r ?FGAI J^iV-Tr>c'VAC3IGISV^LL T ?: 
MAQQ iVViii^ WSSVDSFTQFTQKMLDNF YNi- 
ASSFAV AO>DDTQ/RPSEMFIPANVVLKWYENF 
QRRTSTEP2LI BNIIWIKINF 


3443 


A 


3 


1373 


SWHVRRRWKlLVIMAGGMKVAVSPAVGPGPWG 

SGVGGGGTVRLLLELSGCLVYGTAETDVNWML 

QESQVCEKRASQQFCYTNVLIPQWHDIWTRIQIR 

VNSSRLVRVTQVENEEKLKELEQFSIWNFFSSFL 

KEKOTOTYVNVGLYSTKTClJCVEnEKDTKYSVI 

VIRRFDPKLFL\aa,LGLMLFFCGDLLSRSQIFYYS 

TGMTVGIVASL\LIIIFa^KFMPKKSPrYVlLVGGW 

SFSLYLIQLVFKNLQEIWRCYWQYLLSYVLTVGF 

MSFAVCYKYGPLENERSINLLTWTLQLMGLCFM 

YSGIQIPHIALAiraALCriXNLEHPlQWLYrrCRKV 

CKGAEKPVPPRLLTEEEYRIQGEVETRKALEELR 

EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 

PNEVSVHEQEYGLGSI1AQDEIYEEASSEEEDSYS 

RCPAITQNNFLT 


3444 


A 


566 


1718 


KGIJERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPGVGSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSnFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNYEVLIYVroVE^ 

ELEKDMHYYQSCLEAIIXJNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQUPNVQQLEMNLRNFAEIIE 
ADBVIXFERATFLVISHYQCKEQRDAHRFEKISNI 
IKQFKL£CSKIAASFQ$MEVRNSNFAAFIDIFTSN 
TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 
DGPKQCLLMR 



3445 



566 



1718 



KGIJERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMEWFTSQRDNIFTWVEVLIYVFDVES^ 

ELEKDMHYYQSCLEA1LQNSPDAKIFCLVHKMD 

LVQEDQRDUFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIWQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKI^CSJQAASFQSMEVRNSNFAAFTOIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 



3446 



566 



1718 



KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRfflSLQINSSLST 

YSLVDSVGmXTFDVEHSHVRFLGNLVO^WI^ 

GGQDTTMEKYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEnE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFEDIFTSN 

TYVMWMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 



3447



2930 | VLLGPLWDKLSTADHPVIVTM A SKRKSTTPCMIP 

VKTV.VLQDASMEA.OPAETLi , .A PEAS. * ; 

ASSEAAQNPSSTV S il-ANGriRSTLDGYLYSCF- ( 
YCDFRSHDMTQFVGHMNSEHTDFT^KDPTFVCSG . 
CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 
NHVWEQSIPESTSTPDLAGEPSAEGADGQAEmT 
KTPIMKIMKGKAEAKKnniJKENVPSQPVGEALP 
KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 
PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 
HHVHQPUTAKALPKVMIPLSSIPTYSAAMDSNS 
FLKNSFHKFPYPTKAELCYLTVVTKYPEEQLKIW 
FTAQROCQGISWSPEEIEDARKKMFNTVIQSVPQ 
PTTTVLNTPLVASAGNVQHLIQAALPGHVVGQPE 
GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 
QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 
TSQAFLDASIYKNKXSHEQLSALKGSFCRNQFPG 
QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 
GSRAMIPGDHRSIIIDSVPEVSFSPSSKVPEVTCIPT 
TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 
LRALESSFAQNPLPLDEELDRLRSETKMTRREIDS 
WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 
GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 
INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 
EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 
YDSIMAQTGLPRPEVVRWFGDSRYALKNGQLK 
WYEDYKRGNFPPGIXVIAPGNRELLQDYYMTHK 

MLYEEDLQNLCDKTQMSSQQVKQWFAEKMGEE 

TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 

VSENSESWEPRVPEASSEPFD\TSSPQAGRQLETD 


3448 


A 


2 


1324


FVARAEKGFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITF1XGDAGPIAVAA\^YHKMN 

NEKQAEIX^ITRLIHLNKIDPHAPNEMLYGRIGYIY 

ALI^VNKOTGVEKIPQSfflC^ICEmTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVCQLKFPSGN 

YPPCIGDNRD1XVHWCHGAPGVIYMLIQAYKVF 

R/BREKYLC\DAYQCADVIWQYGLLKKGYGLCY\ 

GSAGNAYAFLTLYNLTQDMKYLYRACKFAEWC 

LEYGEHGCRTPDTPFSU^GMAGTIYFLVADLLFP 

TKAR\FPAFEL 


3449 


A 


3 


2389 


SRHVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGKLYAMKVLRKAALVQRAK 

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

HLBLDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVL ALEHIJIKLGIIYRDLKLENVLLJ) SEGHIVLTD 

FGLSKEFLTEEKERTFSFCGHEYMAPEIIRSKTGH 

GKAVDWWSLGELLFELLTGASPFTLEGERNTQAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGQ\PPPG 

DPRIFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGS^SVCRRr?,C; Tj^OQEFAVKaL S^VEANTOr. 

EVAALRLCQi r ' • JLHEVHHJ^LHTYLVLEL 

J^GGELLEHmK^JaffSESEASQILRSLVSAVSFM 

HEEAG VVHRDLKiTJI TIL YADDTPGAPVKJJDFG/F 

SPRLRPQSPGWMQTVSr iXQYAAPELLAQQGYD 

ESCDLWSLGVILYXMMLSGQAPFQGASGQGGQS 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRIJKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS 


3450 


A 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

WELPGFYFDPEKKRYFRLI^GHNNCNPLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQIX3FLNVTNYCHLAJffiLRLSCMERKKV^ 

MDPSAJ^SDRFNLILADTNSDRIJFTVNDVTVGGS 

KYGIINLQSLKTrTLKVFMHENLYFI>nUCV\NSV 

CWASLNHLDSmJXClJ^Gl^JETPGCATLLPASL^ 

vnshpagidrpg\mlcsfripgawscawslniqa 

nncfstglsrrvlltnwtghrqsfgtnsdvla 

qqfalmapllfngcrsgeifaidlrcgnqgkgw 

katri^hdsavtsvrilqdeqyijviasdmagkik 

lwdlrttkcvrqyeghvneyaylplhvheeegi 

lvavgqix:ytriwslhdarl^tipspypaskad 

IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 


3451 


A 


19 


6033 


LLSAMLSHGAGLALWTTLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRnX3SPCNFSLIYSSDTLGA 

ALCPTFRmNTTYGCNLQDLQAGTIYNFKnSLD 

ERTVVLQTDPLPPARFGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGELVHGGWDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNIJCVTND 

GSLTSLKVKWQRPPG\NVDSYNITLSHKGTIKESR 

VIAPWTRETHFKELVPGRLYNQVTCSAVSLGELS 

AQKMVAVGRTFPDKVANLEANNNGRMRSLVVS 

WSPPAGDWEQYRILIJNDSVVIiNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTWIAVLQLRVKHANETSLSINfWQTPVAEWEK 

YHSLADRD1JLXIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGN1TSSLFTKWTQAQGDVEFYQVLUHENVVIK 

NESISSETSRYSFHSLKSGSLYSVWTTVSGGISSR 

QWVEGRTVPSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDhnTEVTLSHDGKWQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKNKF1QTKSIPKSENECVFVQLWGRLYSVTVT 

TKSGQYEANEQGNGRTIPEPVKDLTLRNRSTEDL 

HVIWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 

TATEYRFTSLTPGRQYKBLVLTISGDVQQSAFIEG 

FTVPSAVKNIHISPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKrR^FFHTFHRLEAGEQ 

YQEvIIASVi3GSLK>;OINWG? I 1 A~7QC7IA~ :; i 

ayssysli/swq; agvaer^tdillltengill^ i 

ntsepattkqhkfedltpgkkykiqeltvsgglfs i 

keaqtegrtvpaavtdlritenstrhlsfrwtas 

egelswyniflynpdgnlqeraqvdplvqsfsfq 

ml(^rmykmvivthsgel^nesfifgrtvpasv 

shlrgsnrnttdslwfnwspasgdfdfyelilyn 

pngtkkenwkdkdltewrfqglvpgrkyvlw 

wthsgdlsnkvtaesrtapsppslmsfadiant 

slaitwkgppdwtdyndfelqwlprdaltvfnp 

ynnrksegrivyglrpgrsyqfnvktvsgdswk 

tyskpifgsvrtkpdkiqnlhcrpqnstaiacsw1 

ppdsdfixjysiecrkmdixjevefsrklekeksll 

nimmlwhkrylvsikvqsagmtsevvedst[t 

mmrpppppphirvnekdvuskssinftvncswfs 

dtngavkyftvvvreadgsdelkpeqqhplpsy 

leyrhnasirvyqtnyfaskcaenpnsnsksfn1 

klgaemeslggkcdptqqkfcdgplkphtayri 

siraftqlfdedlkeftkplysdtffslpntesep 

lfgaiegvsaglfligmlvawallicrqkvshg 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPDC 
INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 
SCDIALLPENRGKNRYNNILPYDATRVKLSNVDD 
DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 
WKMVWEQNVHNIVMVTQCVEKGRVKCDHYW 



WO 01/57190 



PCT/US01/04098




PADQDSLYYGDLILQMLSESVLPEWTIREFKICGE 

EQLDAHRURHFHYTVWPDHGVPETTQSLIQFVR 

TVRDY1NRSPGAGPTVVHCSAGVGRTGTFIALDR 

ILQQU5SKDSVDIYGAV\HDLRLHRVHMVQTEC 

QYVYLHQCViaDVUlARKLRSEQEWLFPIYENV 

NPEYHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

AJRNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVVVDTPGnTDTE 

VPN AETSKEI IRCILLTSPGPHALLL V VPLGRYTEE 

EHKATEKILK^MFGERARSFMILIFraKDDLGDTN 

IJIDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLLGLIQRVVRENKEGCYTNRMYQR 

AEEEIQKQTQAMQELHRVELEREKARIREEYEEK 

IRKLEDKVEQEKRKKQMEKKLAEQEAHYAVRQ 

QRARTEVESKIXjIIJELIMTALQIASFILLRLFAED 


3453 


A 


2674 


514 


GPITr^KKKAKMKDMPLlUHVLIXjIAITTLVQAV 
DKKVIX^PRLCTCEIRPWFTPRSIYMEASTVDC^ 
IXjLLTFPARlJ^ANTQILLLQTNNIAKffiYSTDFPV 

nltgldlsqnnlssvthhngkkmpqllsvyleen 

kltelpekclselsnlqelyinhnllstispgafig 

lhnllrlhlnsnrlqmd^skwfdalpnleilmig 

e^phrikdmnfkplinlrslvuginlteipdnal 

vglenlesisfydnrlikvphvalqkwnlkfld 

LNKNPmRIRRGDFSmiLHLKELGINNMPELISID 

slavdnlpdlrkieatnnprlsyihpnaffrlpkl 
eslmlnsnalsalyhgtieslpnlkeisihsnpirc 
dcvirwmnmnktnirfmepdslfcvdppefqgq 
nvrqvhfrdmmeiclpliapesfpsnlnveagsy 
vstmcrata\epqpeiywtitsgoklij , nt^tdkf 
yvh^gtldi::gvt^i1^ ^ytcu:^ . 

SViVnK\OXjSFT>QD^ 

KASSKILKSSVKWTAr^ KTENSHAAQSARIPSDV 

KVYNLTHLNPSTEYiaClDI>TIYQKNRKKCVNVT 

TKGLHPIXJKEYEKN>TITTLi\lr-CL^ 

LIS(XSPEMNCDGGHSYVRNYLQKPTFALGELYP 

PLINLWEAGKEKSTSLKVKATVIGLPTNMS 


3454 


A 


1844 


244 


ERYIJFATYVAPSAIIJ>IGIX^EKKKEIYMKIQPP 

FEDLroTAEEYDXIXI^WTKMVKSDQIAYKKV 

ELVEETRQLDSTYrmLQALHKETF^KKAEDTTC 

EIGTGILSLSNVSKRTEYWDNWAEYKHFKFSDL 

l^NKLEr^HFRQFLETHSSSMDLMCWTDffiQFRR 

ITYRDRNQRKAKSIYIia^KYLNKKYr^ 

LYQQNQVMHI^GGWGKILHEQLDAPVLVEIQK 

HVQNRLENVW1PLFLASEQFAARQKKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCKIIAFRK 

ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVQ 

KYKDLCHSHCDESVIQKKITTIINCFINSSIPPALQI 

DIPVEQAQKnEHRKELGPYWREAQMTFLGVMF 

KFWPQFCEFRKNLTDENIMSVI^RRQEYNKQKK 

KIAViyQNDEKSGKlDGIKQYANTSVPAIKTALLS 

DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 

ELEKVSCLQACNLSQILRLALQLCL 


3455 


A 


228 


3330 


APTAQAMMSFGGAD ALLGAPFAPLHG GGSLHY 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS 

ASPSRFRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL 

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ 

SEEWFRVIUJ3RLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

LNVKMALDIEIAAYRKLLEGEECRJGFGPIPFSLP 

EGLPKIPSVSTHDCVKSEEKIKVVEKSEKETVIVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PEVAKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEEVAKSPEKAKSPEKAKSP 

EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 

DSKKEEAPKKEAPKPKVEEKKEPAVEKPKESKV 

EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 

EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD 

TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 

KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFIPGHASKSAPMNGHCFAENGPSQK^ST .PPLL 

if ?CE! nLGPffi?jiDCVVC3FKr;,TVNGV< , . ! a>F .. 

TPIKNSPSLFPCAPLCiiRGSi; 7 - ??LPISEA^SLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPGXRRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNG\GVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCILPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHVT)I^RNKDVRSWSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 

MVPIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQSWSKQATSALQQEETSEKKS 

RKVVHOIGKAERLSUIKTLEETETYHAKLNEDHLL 

HAPEFHKPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQWINVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVVVKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFE1HFD 

DKFDVSFGREGETMSLGCRWITPEIKHFQPEIQ 

WYRNGVPLSPSKWVQTLWSGERATLTFSHLNKE 

DEGLYTIRVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYmSWKQPAVDGGSPIL 

GYFIDKCEV GTDS WSQCNDTPVKFARFPVTGLIE 

GRSYEFRVRAVNKMGIGFPSRVSEPVAALDPAEK 

AMJCS/PPI^TUDWTNVIVTEEEPSEGIV^ 

VTEATRSYV\nLSWKPPGQRGHEGMYFVEKCEA 

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTWGDKLDIPKAPGKI 

IPSRNTDTSVWSWEESKDAKELVGYY1EANVA 

GSGKWEPCNNNPVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEAIEVKAA1APPSPPCDITC 

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGVPGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWTIAVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDOCEAKAKEDQWRGLNEAAIKNVYLKVRG 

LKEGVSYVFRVRAINQAGVGKPSDLAGPWAET 

RPGTKEVVVNVDDIX5VISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCDVTDTOGIASSYLIDEEELKRLLALSH 

EHKFPTVPVKSEIAVEILEKGQVRFVWMQAEKLS 

GNAKVNYIFNEKGIFEGPKYKMHIDRNTGIIEMF 

MEKLQDEDEGTYTFQLQDGKATNHSTWLVGD 

WKKLQKEAEFQRQEWIRKQGPHFVEYLSWEVT 

GECNVLLKCKVANIKJCETHIVWYKDEREISVDE 

KHDFKDGIC1XLITEFSKKDAG1YEVILKDDRGK 

DKSRLKLVDEAFKELMMEVCKK1ALSATDLKIQ 

STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

RVKTG\nTGEQIWLQINEPTPNDKGKYVMELFDG 

l".':7 rHQKTvTDLiSCC.LYDEAYAE^QRJ-KQAATAIZK 

NxsAiLVLGGLPDV VTIQEGKALNLTCNVWC. ; - .'tt> 

^SWLKNEKALASDDHCNUO^EAGRTAYFTING 

VTTADSGKYGLVVKNKYGSETSDFTVSVFIPEEE 

ARMAALESLKGGKKAK 


3458 


A 


3963 


827 


LSRSSSDNmOTLGRNVMSTATSPLMGAQSFPNL 

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 

EEVNOLRRPSLQRRAGSRSDVTHHAVTSQLPQVP 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

RQFSALWAFDPRPGRTRVQQTTDLEIPPPGTPHS 

EIJJBEVECTPSPRLALT1XVTGLGTTREVELPLTN 

FRSTIFYYVQKLIXJI^CNG^KSDKIJRRIWE 

HMYREMKDSDKEKENGKMGCWSIEHVEQYLG 

TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

IRKNRNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 

LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 

LRILYIVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 

KKITTKILQQIEEPLALASGALPDWCEQLTSKCPF 

LIPFETRQLYFTCTAFGASRAIVWLQNRREATVE 

RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

MEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 

LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI 

tklfhflgif1akciqdnrlvdlpiskpffk1jmcm 

gdjksnmskliyesrgdrdlhciesqseasteeg 

hdslsvgsfeedsksefildppkpkppawfngilt 

wedfelvnphrarfijceikd1aikrrqilsnkgl 

sedekktklqelvlknpsgsgpplsiedlglnfqf 

cpssriygftavdlkpsgedemitmdnaeeyvdl 

mfdfcmhtglqkqmeafrdgfnkvfpmeklssf 

sheevqmilcgnqspswaaediinytepklgytr 

dspgflrfvrvix:gmssderkaflqfttgcsilp 

pgglam.hprltvvrkvdatdasypsvntcvhy 

lklpeysseeimrerllaatmekgfhln 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAWQKQIRRGVKEVQKPVNKGEKGIMVLA 

GDTlJIEVYCrnj'VMCEDRM^PYVYTPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A 


139 


1997 


QVTh^SDKSEIJCAEIJaRKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK 

TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLWGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MEL\^QSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYT 'JRT-T1S7.: \GK^,^ CRQGPlTGninivv 

VGAVDFSHLYVTSSFD /r Tv^vVTTKNNKPLYSF 

EDNAGYVYDVMWSPTrlP, JLFACVDGMGRLDL 

WhO>NNDTEVPTASISVEGN?A^NRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPIvhiuEWARFGRTL 

AEINANRADAEEEAATRIPA 


3461 


A 


139 


1997 


QVTNMSDKSEIJCAELERKKQRI^QIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPTVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK 

TIJCKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQIN1FFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFFVDER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTBVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3462 


A 


2 


2643 


TAPEFSRSTHASAHASVARVLRNREIAQLKKEQR 

RQEFQIRALESQKRQQEMVLRRKTQEVSALRRL 

AKPMSERVAGRAGLKPPMLDSGABVSASTTSSE 

AESGARSVSSIVRQWNRKINHFIXyDHPAPTVNGT 

RPARKKFQKKGASQSFSKAARLKWQSLERRIIDI 

VMQRMTIVNLJBADMER^ 

KRERLQAESPEEEKGLQELAEEIEVLAANIDYIND 

GITDCQATrVQLEETKEELDSTDTSVVISSCSLAE 

ARLLU3NFLKASIDKGLQVAQKEAQIRLLEGRLR 

QTDMAGSSQKHLLLDAIJREKAEAHPELQALIYN 

VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 

DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 

TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 

PTRGSTFPRQSRATETSPLTRRKSYDRGQPIRSTD 

VGFTTPSSPrTRPRNDRNVFSRLTSNQSQGSALD 

KSDDSDSSL\SEVLRGIISPVGGAKGARTAPLQCV 

SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 

LVTGQEIAALKGHPNNWSIKYCSHSGLVFSVST 

SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 

RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

WELSRFQPVGKLTGHIGPVMCLTVTQTASQHDL 

VVTGSKDHYVKMr^GECVTGTIGPraNFEPPH 

YDGDECLAIQGDILFSGSRDNGIKKWDLDQQELIQ 

QEPNAHKDWVCALAFIPGRPMLLSACRAGVIKV 

WNVDNFTPIGEIKGHDSPINAICTNAKHLFTASSG 

CRVK\nVNYVrXjLTTCLPRRVKAJKGRATTLP 


3463 


A 


198 


3146 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 

G VYRAES IHTGLEVAIKMIDKKAMYKAGMVQR 

VQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 

MCHNGEMNRYLKh^VKPFSENEARHFMHQIITG 

^ILYlJiSHGnH^^LTLSNLLLTTlT!! INDCJ/.DFGL i 

A iQLKMjrmKHY71£GTPOT 

SDWSLGCMrTTIXIGRPPFDTm^KNTL^ 

LADYEMFTFLSIEAKDLIHQLLRRNPADRLSLSSV 

U>H?FMSKNSSTKSKDLGTVEDSIDSGHATISTAI 

TASSSTSISGS1JT)KRRLLIGQPIJPNKMTVFPK2^ 

SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 

AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 

ERCHSAEMLSVSKRSGGGENEERYSPTDNNANEF 

NFr^KTSSSSGSFERPDNNQALSNHLCPGKTPFP 

FADPTFXJTBTVQQWFGNIXJINAHLRKTTEYDSIS 

PNRDF(^HPDLQKDTSKNAWTDTKVKKNSDAS 

DNAHSVKQQNTMKYMTALrlSKPEIIQQECVFGS 

DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 

LKPIRQKTKKAVVSBLDSEEVCVELVKEYASQEY 

VKEVLQISSDGNTITIYYPNGG\RGFPLA\DRPPSP 

TVDNISR\YSrAD>fiLPEKYWRKYQYASRFVQLVRS 

KSPKITYFTRYAKCILMENSPGADFEVWFYDGV 

KIHKTEDHQVIEKTGKSYTIJCSESEVNSLKEEIK 

MYMDHANEGHRIOLALESIISEEERKTOSAPFFPn 

IGRKPGSTSSPKAI^FPPSVDSNYPTRDRASFNRM 

VMHSAASPTQAPILWSMVTNEGIXjLTTTASGTD 

ISSNSUCDCLPKSAQLLKSVFVKNVGWATQ\LTS 

GAVWVQFN1X}SQLVVQAGVSSISYTSPNGQ\TTR 

\YGENEKLPDYIKQKLQCLSSILLMFSNPTPNFH 


3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEBWDVPQMKKEVESLKYQLAFQR 
EMASKTIPELLKWIEDGIPKDPFLNPDLMKNNPW 
WEKGKCTIL 


3465 


A 


5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGWRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQU1JCYILAHFRGHPALSGSPDPQAVH 

SLBEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELIKLNWIXAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQP! SXKTSSQSNSSSEGSMWA 

TV^PTf ^ItDTASE V/^ETDS /SLA; "wL : 1 -Ji 

APAAPK^JPMAQYM /NPFEGPNDHPii IhPJJri* i X 

GDYIYffGDMDEIXjFYEGELEIXjRRGLVr^NFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEAL£30S 

IXSGKAQGVVDRGLCQMVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

RErL\LTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDIXQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HM^jSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

EERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPLWTPKIMIAALDYDPGDGQMGGQGKGRL 

AIJRAGDVVMVY\GPMDDQGFYYGELGGHRG\L 

VPANLRIKMSSQGH 


3466 


A 


1 


1111 


MSKPPDLLLRIXRGAPRQRVCTIJFnGFKFTFFVSI 

MIYWHWGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGNIFFLETSDRTNPNFIJ^CSVESAARra 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPIJDLREU^RDTPI^DWYAAVQGRWEPYLL 

PVLSDASR1ALMWKFGGIYLDTDFIVLKNLRNLT 

NVUjTQSRYVLNGAFIJ^FERRHEFMALCMRDFV 

DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRGVTTLPPEAFYPIPWQDWKKYFEDINPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 


3467 


A 


1 


2175 


MAKVILKQSKQCKNLLTCKVAQVCPVCGCLHC 

YFWWl^GLESRRPSSPLIDIKPIEFGVLSAKKEPIQ 

PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 

TDEEILSKYQLGMLHFSTQYDLU1NHLTVRVIEA - 

RDIJTPISHIXjSRQDMAHSNPYVKICIXPDQKNS 

KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 

TVVDFDKFSRHCVIGKVSyPLCEVDLVKGGHW 

WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 

LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 

NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVM 

LLSLWAPTWLGLSFYDSRTTSLLHPARQIQLPVSL 

QRGEGEAMLSXALTLFSRSPLEQNHQPLVLSLLHL 

CGSVVNMPPGNSQPRGDFLYHSICTWVQDNYAQ 

PLTRESVAQFFNITPKHLSKLFAQHGTMRFIEYVR 

WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 

CRWRRQFGMDYVDCLQIHRWDYNTPffiETLEAL 

NDWKAGKARYIGASSMHASQFAQALELQKQH 

GWAQFVSMQDHYNLIYREEEREMJ PLCYQEGV 

A VIPWSPLA TR?WCI*TTARL\ ~ EV~KNT 

YKESDENDAQIAERL 10 \ ^ELGATiUQVALAW 

IJLSKPGIAAPnGTSREEQLDELLNAVDITLKPEQI 

AELETPYKPHPWGFK 


3468 


A 


147 


3209 


ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAC 

STDPPVMVEGGQENPNNSSASSEPRPEGHNNPQ 

VMDTEHSNPPDSGSS VPTDPTWGPERRGEES SGH 

FLVAATGTAAGGGGGLELASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPJKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK 

PTPAPSPAI^GSTIXJLIASPHIJ^STTGLLAAQC 

LGAARGLEATASPGLLBCPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQIHLRSHTGER 

PYKCNV CGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAFNKFVLMKAVEPKNKADENTPPGSE 

GSA1SGVAESSTATRMQLSKLVTSLPSWALLTNH 

FKSTGSFPLPLCARALGVASPSETSKLQQLVEKID 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV 

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNIJRAHFVGHKASPAARAQNSCPICQKKFT 

NAVTLQQHVRMHLGGQIPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEED VTDEDSLAGRGSESGGEKA] S VRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLrATCWCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQN1AALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 


3 


5664 


NLRPl^FALFLGDPNMANLJEESFPRGGTRKJHKP 

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG 

CVK^VNELELWSIJ'NGLQGFVQVTEICDAYTKK 

UTCQVTQEQPUO)IXHIJ^ 

GITDRGKKSVKI^LNPK>rVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKNKGAKLKVGQYLNCIVEKVKGNGGVVSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLWKAQVQ 

KVTPFGLTLNFLTFFTGWDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRWHLSLRPIFLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 

AYARI^HLSDSKNVFWEAFKPGNTOKCRnDYS 

QMDELALLSLRTSIIEAQYLRYHDIEPGAWKGT 

VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 

WEKKYKGDEVKCRVLLCDPEAKKLMMTLKKT 

LIESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV 

KFYNNVQGLVPKHELSTEYIPDPERVFYTGQVV 

KVVVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 

QKKGKAINIGOLVDVKVLEKTKDGLEVAVI PHN 

IIL*JFIJTrSHL7^ " ^DILKRVL j 

G C^SEGR^iXORKPALVST^^^ 

PGMmGFVKSIKDYGVnQU^SCT-.SGlJ^PKAIMS 

DKFVTSTSDHFVEGQTVAAKVTO \7)EEKQRMLL 

SLRLSIX;GUrDIAITSLIII^QCLEEl^i\ : RSLM 

SNRDSVUQTLAEMTPGMFLDLVVQEVLEDGSV 

VFSGGPVPDLVLJLASRYHRAGQEVESGQKKKVV 

ILNVDliiCLEVHVSLHQ\DLV\NRKARKLRKGSE 

HQAWQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTl^IGDMVTGTVKSIKPTrlVVVTLEDGIIGCI 

HASHILDDWEGTSPTTKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRTIPELSVRPSELEDGHTAL 

NTHSVSPMEKIKQYQAGQTVTCFLKKYNVVKK 

WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATWGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRVVKVTr^GLTVSFPFGKIGTVSEFHMSDSY 

SETPLEDFVPQKVWCmSTADNVLTLSLRSSRT 

NPETKSKVEDPEINSIQDIKEGQLLRGYVGSIQPH 

GVtTRUjPSVVGIjVRYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 

VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK 

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 

YYREGKEEAEETNVU>KEKQTKPAEAPRLQLSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI 

KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 

SADDHDRLVl^SPNSSILWLQYMAFHLQATEIEK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

EKFQEAGELYNRMLKRFRQEKAVWIKYGAFLLR 

RSQAAASHRVLQRALECLPSKEHVDVIAKFAQL 

EFQLGDAERAKADTBNTLSTYPKRTDVWSVYID 

MTDCHG SQKDVRDIFERVIHI^LAPKRMKFFreJR 

YLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 

ED 


3470 


A 


2334 


1226 


TAAAPVAPGTMDDATVLRKKGYIVGINLGKGSY

AKVKSAYSERIJCFNVAVKIIARKKTPTDFVERFL 

PREMDI1j\TVNHGSIIKTYEIFETSDGRIYIIMELG 

VQGDIXEFIKCQGALHEDVARKNIFRQLSSAVKY 

CHDLDIVHRDLKCENLLLDKDFNIKLSDFGFSKR 

CLRDSNGRIILSKTFCGSAAYAAPEVLQSIPYQPK 

VYDIWSLGVILYIMVCGSMPYDDSDIRKMLRIQK 

EHRVDFPRSKNLTCECKDLIYRMLQ\PDVS\KRLH 

IDEILSHSWLQPPKPKAATSSASFKREGEGKYRAE

CKLDTKTGLRPDHRPDHKLGAKTQHRLLVVPEN 

ENRMEDRLAETSRAKDHHISGAEVGKAST 


3471 


A 


537 


148 


TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLMNN/FYPGELMVTWKAD 
GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEQW 
RSRRSYSCQVMQEGSTVEKSVAPAECS 


3472 



A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 
WLPNHWFLRLREGLKNQSPTEAEKPASSSLPSS 
PPPQLLTRNWFGLGGFLFLWDGEDSSFLWRLR 

GP7GGG2ET^ rQYQRLLCIl\?ri™nYQVLL£?7 
QHHVAL&lKG^^ 

VNCSTTPVAiiRFFreSTSLTLKHAAWYPSEILbPH 

VVLLTSDNVIRIYSIJREPQTPT^^ 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASXAAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDOPSLYVFECVEIJELALKLASGEDDPFDSDFSC 

PVKIJfflRDPKCPSRYHCraEAGV^ 

HKFLGSDEEDKDSlXJELSTEQKCFVEHEXnXPLP 

CRQPAPIRGFWIVPDIU}PTMICITSTYECLIWPLL 

STVHPASPPIXCTREDVEVABSPLRVLAETPDSFE 

KHIRSI1XJRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYIIJCQDLAKEEIQRRVKIXCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSBLPVLSDSERDMKKEL 

QLIPIXJLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIII^AYQRKCIQSII^EGEHIREMVKQIN 

DIRNHVNF 


3473 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHWFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNVVFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGDCGLMVLELPKRWGKNSEFEGGKST 

VWCSTTPVAERFrTSSTSLTUOI^ 

\m.LTSDNVIRIYSLRErX5TrTN^ILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASVAAEDWGYDACAVLCLPCVPN 

ILV1ATESGMLYHCWLEGEEEDDHTSEKSWDSR 

EDUPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFIXJSDEEDKDSLQELSTEQKCFVEHEXTKPLP 

CRQPAPIRGFWIWDILGPTMICITSTYECL1WPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKD1APPPEECLQLLS 

RATQVFREQYILKQDlJVKEEIQRRVKIXCIXJKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSEU>VI^DSERDMKKa 

QLIPIMJLRHLGNAIKQVTMKKDYQQQKMEKVL 

SI^KPTIILSAYQRKCIQSIIJCEEGEHIREMVKQIN 

DIRNHVNF 


3474 



A 



4344 



2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDr^Dl^SPNASDTECSDEIPIJCVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TI>n ; VKNLSAl^DWYSVYTSAIAFIV^ 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQMLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA 

SCFFPYRLVGLAVGLYAGDCFFLIDFTFKRCPRLR 

AXYDTPYHV/P^LPTDPOTJ^RSSA- 7 'RRl ~»T7<? 

SRSYVPSAPAGLCiKEEi \C<^,FHS r i 1 K.-:GMFHKFN 

LTENERP1AVCENGWRCCIJNRDRKMPTDYIRN 

GVLYVTIENT^CFESSKSGSSKR*^^ 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETTLSQYIKITSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

IELMESRKDITNQEELWKMKPRRNLEEDDYLHK 

iyTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQEUTQWHU>IKIAA^ 

TSH(^YFraPILVINKVLPMVSnLIjU.VYLroV 

IAA1VQLHNGTKYKKFPHWLDKWMLTRKQFGL 

LSFFFAVUIAIYSI^YPMRRSYRYKLLNWAYQQ 

VQQNKEDALVffiHDVWRMEIYVSLGIVGLAILAL 

LAVTSffSVSDSLTWREFHYIQSKLGIVSLLLGTIH 

ALD'AWNKWIDIKQFVWYTPPTFMIAVFLPrVVLI 

FKSILFLPCLRKKEJORHGWEDVTKINKTEICSQL 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVlin'KRLYRAVVEAVHRL 

DLILCNKTAYQEWKPENISLRNKLRELCVKLMF 

LHPVDYGRKAEEIXWRKVYYEVIQLIKTNKKHI 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 

kkcetrki^pgkkrckdikrijlvnfniyix^ 

pksssvdseltsu:qsvi^fnlclfylpsspnls 

lasedeeeyesgyaflpdllifqmvijclmcvhsl 

eragskqysaaiaftl^fshlvnhvnirlqael 

eegenpvpafqsdgtdepeskepvekeeepdpepp 

pvtpqvgegrksrkfsrlsclrrrrhppkvgdds 

dlsegfesdsshdsarasegsdsgsdkslegggt 

afdaetdsemnsqesrsdledmeeeegtrsptle 

pprgrseapdslngplgpseasiasnlqamstqm 

fqtxrcfriapttsot^llqpttothtsashrpcv 

ngdvdkpsepaseegsesegsessgrscrnersiq 

eklqvlmaeglijavkvfldwijitnpdlnvca 

qssqslwnrlsvllnllpaagelqesglalcpev 

qdllegcelpdij^sixlpedmaijwlpplraah 

rrfnfdtdrpixstleesvvriccnisfghfiarlq 

gsilqfnpevgifvsiaqseqesllqqaqaqfrma 

qeearrnrlmrdmaqlrlqlevsqlegslqqpk 

aqsamspylvpdtqalchhlpvirqlatsgrfivi 

iprtvmgldllkkehpgardgiryleaefkkgn 

ryircqkevgksferhklkrqdadawtlyk1ld 

sckqlt^aqgageedpsgmvtntglpldnpsvl 

sgpmqaalqaaahasvdiknvldfykqwkeig 


3477 



A 


1 


3902 


MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 

KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 

GGDAVATTGEIHEEKAWKTRALEVGQPAQRDIR 

RGELWGKEHGADQAIQETLEDLSSLERTLWSES 

SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 

ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 

LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 

HVEIOVT.DINDHQPRFPKGEQELEISESASLRTRIP 

ldra; * '-^rrrm ■■ttjt" -~ .T*THFALr vivTrr 

ETKHi' JELIWKELDRE: SPtZx* XTAYDNGNPP 

KSGTSLVKVNVLDSNDNi?; AFAESSLALEIQEDA 

APGTLLKLTATOPIX^PNGBV1^LSKHMPPE\V 

LDTFSIDAKTGQVILRRPLDYEKIvrT AYEVDVQAR 

DIXJFNPIPAHCKVLIKV^ 

SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 

SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 

YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 

EKSRYEVSTRENNLPSLHLITIKArlDADLGINGK 

VSYRIQDSPVAHLVAIDSNTGEVTAQRSLNYEEM 

AGFEFQVIAEDSGQPMLASSVSVWVSLLDANDN 

APEWQPVLSDGKASLSVLVNASTGHLLVPIETP 

NGLGPAGTDTPPLATHSSRPFLLTITVARDADSG 

ANGEPLYSIRSGNEAHLFILNPHTGQI^VimNA 

SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 

VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 

LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 

rXJKHIQKADIHLVPVTJIGQAGEPCEVGQSHKDV 

DKEAMMEAGWDPCLQAPFHLTPTLYRTLRNQG 

NQGAPAESREVLQDTVNIXFNHPRQRNASRENL 

NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 

EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 

LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 

SLLHQGQFQPKPNHRGNKYLAKPGGSRSAIPDTD 






WO 01/57190 PCT/US01/04098




GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSS1XEMLIJEQRSSMPVEAASEAL 

RRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KUU^GDQRNASYPHCLQFTLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKFSYREhn^EDEYEPRRRDHISHFILRLAYCQS 

EELRRWHQQEMDLLRFRFSILPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDIVAIIL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC 

MRQLHKALRENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYIPFSCLKIILSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRI 

LNGGKD1XKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAELP 

EAARARRIRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKIEEGGMAV 
PLCKAMS(X3LVTFRDVALDFSQEEWEWLKPSQ 

QK£t v;^MVERKMSQGHCADWESWWiiIEELv : ':i 
T.FIDEDEISQENfVMERIASHGLECSSF 
KGEI^XHQGNAERHmQVTAVKEISTGKRDNEF 
SN/IWEIOnPEISIFT^^ 

PQNSVEEYKRLHAEKESLIGNECEEFNQSTYLSK 

DIGIPPGEKF/ESHDFSKLLSFHSIJ^QHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAH1AQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYECI 

KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ
LISLNEKDEEMSTKVYIJDIJEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDWLLNNNDGNFDVALDI 
SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI 
HEGTTIENGQWENIHKPSRLIQPPGDPRGGREGQ 
RQEVIFYLHRRKPLFYLVNVTAPCIUTI^^ 
l^PDAGEKMGLSIFAIXTLTVFLLLLADKVPETSL 
SVPIIIKYLMFIMVLVTO 

THQMPLWVRQIF1HKIJPLYLRIJCRPKPERDLMPE 
PPHCSSPGSGWGRGTDEYHRKPPSDFLFPKPNRF 
QPELSAPDLRRFIDGPNRAVALLPELREWSSISYI 
ARQLQEQEDHDAUOEDWQFVAMVVDRLFLWTF 
HFTSVGTLVVIFLDATYHLPPPDPFP 


3482 


A 


1273 


172 


ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPRELVPKQAPCSP 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA 

RPGRFFGVYIXYCLNraYRVRNVYVGFTV^ 

VQQHNGGRKKGGA\GRTSGRGPWEMVLWHGF 

PSSVAALRFEWAWQHPHASRRLAHVGPRLRGET 

AFAFHLRVlJ\HMIJUVPPWARLPLTLRWVRPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLIJU^HVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET 


3483 


A 


230 


3686 


WRPWPCEDTSWNLQVAARTLRVSSAQCGLVPT 
MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 
SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 
LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 
DQLKKRFAYLSGGRGQDGSPVTTFPDYPAFSEIPD 
KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 
TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 
DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSQ 
LTEDLGGTLDYCHSRWLCQRTAIESFALMVKQT 
. AQMLQSF J Ili AHTTIFND^C 

NQDQLDNQATVQRLLAQLNH". J AAFDEFWAKH 

QQKLEQCLQIJtflFEQGFREVKAII J WASQKIATF 

TDIGNSIAHYEHLLRDLANFQEKSGVTVERARA

LSLTASSFIGNKHYAVDSIRPKCQELRHLCDQFSA

EIARRRGLXSKSIJELHRRIJETSMKWCDEGrYLLA 

SQPVDKCQSQIXJAEAALQEIEKFLErGAENKIQE 

LNAIYKEYESBLNQDLMEHVRKVFQKQASMEEV 

FHRRQASLKKXAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 

RQGRGSAGEEEESLAII^RHVMSELLDTERAYVE 

ELIXTVI^GYAAEMDNPLMAHIXSTGLHNKKDV 

IJGNMEEIYHFHNRIFL^ 

LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKPVQRTTKYQLLLKEM 

LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLI 

AITGYDGNLGDLGKIJJ4QGSFSVWTDHKRGHT 

KVKELARFKPMQRHLFLHEKAVLFCKKREENGE 

GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 

IWYNAREEVYWQAFITEIKAAWVNEIRKVLTSQ 

LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 

NIKKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 

GPKKLVPGKYTWADHEKGGPDALRVRSGDW 

ELVQEGDEGLW 


3484 


A 


208 


6103 


VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAIVELVENGK 

KVKVNKDDIQKMNPPKFSKVEDMAELTCLNEAS 

\^H>nJCERYYSGLIYTYSGLFCVVINPYKNlJ*IYS 

EEIVEMYKGKKRHEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVIQYLAYVASSH 

KSKKDQGELERQLLQANPILEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYU^GAGEHIJK.TDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKKERNTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLR1NK 

ALDKTKRQGASFIGILDIAGFErroi^SFEQIXnW 

TOEKLQQLFNHTMFILEQEEYQREGIEWNFIDFG 

UJLQPCIDLIEkPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMK>JMDPLNDNIATLLHQSSD 

KFVSELWKDVDRIIGLDQVAGMSETALPGAFKT 

RKGMFRWGQLYKEQIjyCIJvL^TLRNTOT 

CIIPNHEKKAGKIJ)PHLVIJ5QLRCNGVLEGIRICR 

(^FPNRVVFQEFRQRYEILIPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVnGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQN1QELEEQLEEEESARQKLQLEKVTT 

"AJ l ":r-,EEECIILLrO XT/ ^KIJaCEI^XO^UEF i 

T 'NLT^aEK^^ 

EEK ^RQELEKTRRKLEGDSTDLSDQIAELQAQMA 

ELKNiQ- AKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDUCCERVASRNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNEL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTI^NERGEIANEVK^ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

AI^QLQDTQEUXJEENRQKLSLSTKLKQVEDE 

KNSWREQLEEEEEEAKTO^EKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 

SACNl^KKQKKFDQIJLAEEKTISAKYAEERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQI^ELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

LEDERKQRSMAVAARKKLEMDLKDLEAHIDSA 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRA!5R 

EEBLAQAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEEI^EQGNTELINDRIJKKANLQIDQI 

KTOLNIJSRSHAQKNEKARC^I^RQNKELKVKL 

QEMEGTVKSKYKASITALEAKIAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLKNKLRRGDL 

PFVVPRRMARKGAGDGSDEEVDGKADGAEAKP 

AE 


3485 


A 


2 


1782 


CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTUTTliGAVVRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNUTUJKRLDI^YNRIGLLDSEWIPVSFAiaNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLKTAVK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYI^GNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRPSMPMHHINLVPGKQLRGIY1JIGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSRQVLLLQDSFMNCSDSHNGSFRALGFIHEAQ 

VGERLMVHCDSKTGNANTDFIWVGPDNRLLEPD 

KEMENFYVFHNGSLYCESPRFEDAGVYSCIAMNK

QRLLNETVDVTINVSNFTVSRSHAHEAFNTAFTT 

IJVACVASIVLVLLYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKRWFL 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 


3486 


A 


357 


1173 


GDPRETKVFPSRSFARNTVGVSHHQSHLFHTVSR 

IYVEDKHKILYCEVPKAGCSNWKRILMVLNGLA 

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTKVLVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAHKKYRFNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMDIHWEKVSVLCYPCLINYDFVGKFETL 

EETANYFLOMGAPK ^XPP^^RHr^r:- " "! 7A 

QV VRQY^ ^LTOTEKQLIYDFYYLD\1 : IV ts i 

PFL 


3487 


A 


2


3281 


CDKSGAVPFSTTRSPRRPSPRSAGPSLSSVSPx^S > 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPBGPVPQSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCIP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 

LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GVAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

QRGEEEEAEARAKLAPGREPPSPCHSEDSLGLGA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 

GKNNDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTCALMLLNTDLHGHNIGKRMTCGDFIG 

NLEGLNDGGDFPRELLJKALYSSIKKEKLQWAIDE 

EELRRFLSELADPNPKVIKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSTOGILKGMILYLQKEEYKPGKALSETELKN 

AISIHHAIAT1USVOTSKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINVVAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATRLSQEEQVRTHEAKLKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLVASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAPVYDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KWYPTKGEKIMI^nTKMTLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA

EDEEWATAGDRLTEFRKGAQSLPGPCAAAREG 

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP

RLTMHYPCLEKPRIWSLAHTATASAVEGAPPARP

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPWQCQYP 

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 


IAAYHKALSYRGHVHANNRGTNNVHFTPPPSPS 

RGE^MNPRNMMNHSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSUCMPKQQPSRQPFTVNSMSGFGMN 

r,NQ^GKINN3I ?SNlFNGTDGSEN^ ..: -1 "131^? 

ALADRNRREGSGNPTiV v IPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLKTSGKTTSSTDGPKFPGDKSSTTQNNNQQKK 

GIQVLPDGRVTNIP^MVTTKJFGMIGIXTFIRAA 

EIDPGMVrnALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TADCLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFMIDWRYHKEERVWITRAPGMEPTMKTNTY 

ERGTYYFFIX^LNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQP1PGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QIXJIJEIYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 

RFHCKLCECSFNDLNAKDlJi^ 

VNPDLPIATEPSSRARKVl^ERMRKQIGILAEERL 

EQlJtRWHAERRRLEEEPPQDVPPHAPPDWAQPL 

IJ4GRPESPASAPLQPGREPASSDDRHVMCKHATI 

YPTEQELLAVQRAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

IXRGDRNVRLAIXCSEKPTHSLLRRIAQQLPRQL 

QMVTEDEYEVSSDPEANIVISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVIVmVLRDLCRRV 

PTVWGALPAWAMELLVEKAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 


3491 


A 


2 


1321 


FVGDGALSGCRRGRAPRVPSMAGSLPPCWDCG 

TGYTKLGYAGNTEPQFIIPSCIAIRESAKVVDQAQ 

RRVLRGVDDLDFFIGDEAIDKPTYATKWPIRHGII

EDWDLMERFMEQWFKYLRAEPEDHYFLMTEP 

PLNTPENREYIAEIMFESFNVPGLYIAVQAVLAL 

AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 

YVIGSCIKHIPIAGRDITYFIQQLLREREVGIPPEQS 

LETAKAIKEKYCYICTDIVKEFAKYDVDPRKW1K 

QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN 

PDFMESISDWDEVIQNCPIDVRRPLYKNWLSG 

GSTMFRDFGRRLQRDLKRWDARLRLSEELSGG\ 

RIKPKPVEVQWTHHMQRYAV\WFGG\SMLASTP 

EFFQVCHTKKDYEEYGPSICRHNPVFGVMS 


3492 


A 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 
REESPAPSRAPASASLWRRLVWEAKMAAHAAA 

BIJPIOELCVilCLQAVFPFKPPQiUEAR ; i^QLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

V/ilCRLLFQI^QLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERICLQEVHPL 

LTLCGQIVEWQGNPIQKESLRWFLVl^VTrm. 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADIJ1IWIJ>KEHMCVLVYLVTVMHSMQAGYLE 

KAQKYIDKALMQLEKLKMLDCSPILSSFQV1LLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQIJiTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWArWTNIASVYIREGNRHQEVAALYS 

IXERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGH1 

F'mXjNHRESNNMVWAMQLASKPDMSVQLW 

SSAIXRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3493 


A 


3 


2024 


PNGVALUn.PGAAVIPNTNYMFQDALGGRSRGS 
REESPAPSRAPASASLWRRLVWEAKMAAHAAA 
AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 
SPPKIRIX^VHCLQAVFPFKPPQRIEARTrlLQLGSV 
LYHHTKNSEQARSHLEKAWLISQQPQFEDVKFE 
AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFIXSKGMIJLLMERKLQEVrIPL 

LTLCGQrVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWIJ>KEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HDMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LT^QELWAFIVTKIASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLREILKMSNAEDLKRLTACSLVLLGHI 

FYVIXJNHRESNNMVWAMQIASKPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNUTWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VIAGQRGPAGGLAEERRRGRNEWR3HDVTTAPF 

IKjLVQRKJSRLUVSQVRYFLKNKVSPDLCNEDGL 

TA1J1QCCJDNFEEIVKLLLSHGANVNAICDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVIETCMAYQGITQEKINEMRV 

APEQQM1ADIHCMIAAGQDLDWIDAQGATLLHI 

AG AN G YLRAAELLLDHG WVD VKI) WE)G WEPL 

HAAAFWGQMQMAELLVSHGAN\LNARTSMDE 

MPIDLCEEEEFKVLIXEOCVHKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKWRRTQPVGTGPNL\YR 

KEYE/GEEAILWQRSAVAEDQRTSTYNGDIRETyR 

TDQENKDPNPRLEK\PVIXSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

EEKV; jv.tc^ixl: 


3495 


A 


327 


1078 


APMADf PNGPQGAGAVQFMMTNKLDTAMWL 

SRLFTVYCS/ t lFVIJE^GLHEAASFyQRAIXANA 

LTSALRUIQm ? riFQLSRAFl^QALLEDSCHYLL 

YSLIFVNSYFVnTMSIFPVIXFSLIJHAATYT^ 

DARG\SNSLPLIJttSVIJ}KLSAN 

FLMPATVFMU?SGQGSLLQPFIYYRFLTLRYSSRR 

NPYC^TLFNELRIVVEHnMKPACT 

1AFISRLAPTVP 


3496 


A 


3 


2867 


SSRTREMEEKEIUaiQIRIXQGLroDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPWQQHVLERQVQLSQGQNVVIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

DAGHTDQPVPSGSVGGPARPASGPRQAREASLV 

VTCRTNKFRKNNYKWVAASSKSFRVARRALSPR 

VAAE^CKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGDVRPALAHSGLKPLSG 

ETPI^AYKVKTOTKtlRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRRRQALRGKSSPVLKKTPNKGLVQ 

VTKHRlXDRIJPSRAHUrrXEA 

VIKTOYRIVKKTPASPLSAPPFPI^LPSWRAWII^ 

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYR 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

BYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTIXCPDFARRGACFRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHE\APSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 

KPLHIKPRL 


3497 


A 


1586 


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCR 

NLQEFLGGl^PGVLDRLYGHPATCLAVFRELPSL 

AKNWVMRMLFLEQPLPQAAVALWVKKEFSKA 

QEESTGLLSGLR1WHTQLLPGGLQGLILNPIFRQN 

LRIALLGGGKAWSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 

GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 

T/RAIAINLSSGVSGAGGTVHQPGFIV\VBTNYRL 

YAYTESELQIALIALFSEMLYPFP\NMW\ARVTR\ 

ESVQQA1ASGITAQQIIHFLRTRAHPVMLKQTPVL 

PPrrTOQIRLWELERDRLRFTEGVLYNQFLSQVDF 

ELLVLAHAPKLGVLVFE/mPAKRLMVVTPAGHS 

DVKRFWKRQKHSS 


3498 


A 


790 


190 


RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR 
SLF; ±1 JGYAA Z>KJ.I L: <2 >STAIVii AS ?C2 C CRI 
IJlGO^YIKVPVTDAf^SKL i ;)FFDPIADL1HTVS 
MR(^RTLLNCMAG\M?viSASLCXAYLMKYHSM 
S\LU)AHTWA/TKSRRPURP^GFWEQLMreFK 
LF^^NNTVR^^INSFV r GNIPD^^.rIDLRMMISM 


3499 


A 


31 


1586 


TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKDVGIL 

ALEVYFPAQYVIXJTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTWQRLMERIQLPWD 

SVGRLEVG'ltlilDKSKAVKTVLMELFQDSGNTO 

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMWCGDIAVYPSGNARPTGGAGAVAMLIGPK 

APLALERGLRGTHMENVYDFYKPMASEYPIVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTIi)DLQYMIFHTPFCKMVQKSIARLMFNDF 

I^ASSDTQTSLYKGLEAFGGLKLEDTYTOKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPLXDKLVSSTSDLPKRLASRKC 

VSPEEFIEIMNQREQFYHKVNFSPPGDTNSLFPGT 

WYLERVDEQHRRKYARRPV 


3500 


A 


185 


2692 


MLPTEVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LIAAPGSITHQDLTEEAAUmi>QIJT^QPPP^ 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR 

arlvgaijretwaaraijjhtlarqrlgaalha 

lqdfyshsnwvelgeqqphphllwprqelqnla 

qvadptcsix:eelscprnwlgftlltsgyfgthp 

pkppgkcshgghfdrsssqpprgginkdstspgfs 

phhmlhlqaaklallasiqafsllrsrlgdrdfs 

rllditpasslsfvldttgsmgeeinaakiqarhl 

veqrrgspmepvhyvlvpfhdpgfgpvfttsdpd 

sfwc^lneihalgggdepemclsalqijujlhtpp 

l^difvftdaspkdafltnqvesltqerrcrvtfl 

vtedtsrvqgrarredlsplrfepykavalasgg 

eviftkix^hirdvaaivgesmaalvtlpldppvv 

wgqplvfsvix3llqkitvrihgdissfwiknpag 

vsqgqeegggp1x3htrrfgqfwmvtmddppqt 

gtweiqvtaedtpgvrvqaqtsldflfhfgipme 

dgphpglypltqpvaglqtqllvevtglgsran 

pgdpqphfshve.rgvpegaelgqvplepvgppe 

rgllaaslsptllstprpfsleligqdaagrrlhr 

aapqpstvvpvllelsgpsgflapgskvplslria 

sfsgpqdldlrtfvnpsfsltsnlsrahlelnesa 

wgrlwlevpdsaapdsvvmvtvtaggreanpv 

ppthaflrllvsapapqdrh 


3501 


A 


1245 


5815 


RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFnXJVEAKWVEVKSKRRDNfTVFSGLFVGG 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGGVSPCEAGEEGE 

GGVCLNGGVCSWDDQAVCDCSRTGFRGKDCS 

QEDNi^/HGT-\HLM?:*GDQGKT lA^CG'EYF 

CYDLSQNPlQSSSI^niSFKTL^ | 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFM)NAWHDVKVTRNlJtQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKMKfflGVVAFKCEhTVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGLILFSHGKPRHQ 

KDAKHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 

IKIKALLKKVNIMEWYHVDFQRDGR5GTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPVVMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRl^LDAGRVKLTVNLDCIRINCNSS 

KGPETLFAGYNLNDNEWHTVRVVRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGIITERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIVVELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

MISRDTSNLHT\OODTKITTQITAGAR]^LKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DL1SDGSFSCNGTDSRRGMWKGPSTT\CQ 

EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND 

PGTTYIFSKGOGQITYKWPPNDRPSTRADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAIEESNAnNIXjKYHVVRFTRSGGNA 

TLQVDSWPVIERYPAGRQLTIFNSQATinGGKEQ 

GQPFQGQLSGLYYNGLKVLNMAAENDANIAIVG 

NVRLVGEVPSSMTTESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE 

SSSTTGMWGIVAAAALCILILLYAMYKYRNRDE 

GSYHVDESRNYISNSAQSNGAVVKEKQPSSAKSS 

NKNKKNKDKEYYV 


3502 


A 


394 


72 


KPAHIJFIVnMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 


3503 



A 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLS 

SLPPPPSRALAPTRAPDTALTEMEVAEVESPLNPS 

CKIMTFRPSMEEFREFNKYLAYMESKGAHRAGL 

AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 

SGUnTQYMQKKAMTVKEFRQLANSGKYCTPRY 

LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV

DEWN1ARLNTVLDVVEEECGISIEGVNTPYLYFG 

MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 

PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 

PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 

HGFNCAESTOTATVRWIDYGKVAKLCTCRKDM 

VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 

TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK 

RPKA OEEEEVSDEVDGAEVPNPDSVTDDLKVSE 

ksea ^ : /ivirNT"Asrr~r - >^*mqv: 3; ;lz.:^ 

KLSGNSGLSTSVTEDliC SuD*^ YAYRSVPS1SSE 

ADDSIPLSTGYEKPEKSDP -RLSWPKSPESCSSVA 

ESNGVLTEGEESDVESHGNCLL1>GEIPAVPSGER 

NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 

VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 

WQTKPPNFAAEQEYNA1VARMKPHCAICTLLMP 

YHKPDSSl^ENDARWETKIJDEVVTSEGKTKPLIP 

EMCFTYSEENIEYSPPNAFLEEDGTSLLISCAKCC 

VRVHASCYGIPSHEICDGWLCARCKRNAWTAEC

CIXNUtGGAIJKQTKNN^^ 

FWVTERTX3IDVGRIPLQRLKLKCIFCRHRVKRVS 

GACIQCSYGRCPASFHVTCAHAAGVLVMEPDDW 

PYVVNITGFRHKVNPNVKSKACEKVISVGQTVIT 

KHROTRYYSCRVMAVTSQTFYEVMFDDGSFSRD 

TrTEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 

GAKYFGSNIAHMYQVEFEDGSQIAMKREDIYTL 

DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 

QAQQETYLGFWINSKKSQCNIFLSGTY 


3504 


A 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHR 

SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQVSSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LER 

VPGIFISRLVRGGLAESTGLLAVSDEILEVNGIEV 

AGKTLNQVTDMMVANSHN\LI\nrVKPANQRNN 
WRGASGRLTGPPSAGPGPAEPDSDDDSSDLVTE 
mQPPSSNGI^C^PPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 



2898 



SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCTKFLHTNSTSHTWPFSAVAELIDNAYDPDV 

NAKQIWIDKTVIKDHICLTFTDNGNGMTSDKLH 

KMLSFGFSDKTVTMNGHVPVGLYGNGFKSGSMXR 

LGKDAIVFTKNGESMSVGLLSQTYLVEVIKAEHV 

VWIVAFNKHRQMINIAESKASLAAnJM 

QKLLAELDAnGKXGTRIIIWNLRSYKNATEFDFE 

KDKYDIRTPEDLDEITGKKGYKKQERMDQIAPES 

DYSIJEUYCSILYLKPRMQnLRGQKVKTQLVSKS 

1AYIERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRNRLIKAYEKVGCQ1JIANNMGVGVVGII 

ECNFLKFIHNKQDFDYTNEYRLT1TA 

YWNEMKVKKhni2YPLNLF\ 

CDACLKWRKLPIX}MDQLPEKWYCSNNP\DPQFR 

NCEWEEPEDEDLVHPTYEKTYKKTOKEKFRIRQ 

PEMIPRINAELLFRFRALSTPSVFSSPKESVSICR/RH 

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS 

TRSSILNAKNRRL\SSQFlENSVYKG\DDDDEDVII 

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LWKKEETVEDEIDVRNDAVILPSCVEAEAKIHE

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQOHMFTOQIKVLQQRILEM^KYVKKETCH 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

IERLKKQCSALQHVKAECSQCSNNESKSEMDEM 

AVQLDDVFRQIJDKCSIERIXJYKSEVELLEMHKS 

QIEl30rm-KTEV^CL 7 ^IQQTATDVSTSi: ".. " _£ 

SVNKMIXJESLKUlSLRVi^^ 

QVNYDVDWDEILGQWEQMSEISST 


3506 


A 


2 


212CO 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAATISSQISGSVTSEKVSRDYKALRDGKKLA 

QMEEAPLFPGESIKAIVKDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRKLRLAYKAQEEQSKLG 

IFENLNKHArTLSNGQAI^AFSYKEKrTINGWKV 

YDPVSEYKRQGLFNESWKISKINSNYEFCDTYPA 

HVVPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQATITRCSQPLVGPNDKRCKEDEKYLQTTMDAN 

AQSHKLHFDARQNSVADTNKTKGGGYESESAYP 

NAELVFLEIHMHVMRESLRKLKEIVYPSIDEARW 

I^NVIXjTHWLEYIRMIXAGAVRIADKIESGKTSV 

VVHCSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

TLVEKEWISFGHRFALRVGHGNDNHADADRSPIF 

IXJFVTXTVWQMTRQFPSAFEFNELrlJTILDHLYS 

QJGTr^CNCEC^RFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYENHVLYPVASI^HLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKIRLTVLCAKNLAKKDFFRI^DPFVAKIVVD 

GSGQCHSTDTVKNTLDPKWNQHYDLYVGKTDSI 

TISVWNHKKIHrOCQGAGFL^CVRLLSNAlSRIXD 

TGYQRIJDLCKLNPSDTDAWGQIWSLQTRDRIG 

TGGSVVDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLRNPDVRGSLQTPQNRPHGHQSPELPEGYEQRT 

WQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 

ELGPIJ>PGWEVRSTVSGIUWVDHNNRriX3FTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 

QRYERDLVQKLKVLRHELSLQQPQAGHCRIEVS 

REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 

LDYGGVAREWLYLLCHEMLNPYYGLFQYSTDNI 

YMLQINPDSSINPDHLSYFHFVGRIMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

VWILEhmiTP^DHTFCVEHNAFGmQHEUO'N 

G\RNWVTEENIGCEYVRLYVNWRFMRGIEAQFL 

AIXJKGFNELIPQHLLKPFDQKELELnGGIJDKIDL 

NDWKSNTRlJGICVADSNIVRWFWQAVETroEE 

RRARLLQFVTGSTRVPLQGFKALQGSTGVAAGPR 

LFTIHLIDANTDNLRKAHTCTNRIDIPPYESYEKL 

YEKLLTAVEETCGFAVE 


3508 


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTEFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKXASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLIYFIDDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKI^IJKEITNV 

QVV7rMNI^/iG3rT^^RLQRHF3\TW-S^TOAD 

S< LSbl v r IILTQHLKLGNFP ASLQKSlPPLiDLAI , - \? 

H(;;XlAT17IJ*TGIKPrryiFNLRDFAl^ 

ECvlCS'TWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQ*rE\^KKTFDDmDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVUTEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFIJ4TDAQVADERFL 

VLINDllASGEIPDLYSDDEVENnSNVRNEVKSQ 

GLVDNRENCWKFF1DRIRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVXQS1SKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLIJCLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQWGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKThn^TELBCSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKJEN1HENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDELLITAFISYLGFFT 

KKYRQSIJ^RTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATELJNCE 
RWPLMVDPQLQGDCWIKNKYGEDLRVTQIGQKG 
* YLQHEQALEAGAWHENLEESIDPVLGPLLGRE 
VIKKGRFIKIGDKECEYNPKFRLn 
PELQAQATLI^^^V^^XJLEDQUJ^AWSMERP 
DIJEQUCSDLTKQQNGFKITLKTLEDSLLSRLSSAS 
GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 
EVKINEAREHYRPAAARASLLYFIMNDI^KIr^ 
YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 
SITTSVYQYmGI^EC^KLTYLAQLTFQILLMNR 
EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 
VKVLSSMEEFSNLDRDEEGSAKSWKKFVESECPE 
KEKU^EWKhnCTALQrO.CMLRAMRPDRMTYAL 
RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 
FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 
GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 
RETEFKSILFALCYFHAWAERRKFGPQGWNRSY 
PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 
EIMYGGHTTDDWDRRLCRTYLGEFIRPEMLEGEL 
SLAPGFP1JPGNMDYNGYHQYIDAELPPESPYLYG 
LHr^AEIGFLTQTSEKLFRTVLELQPRDSQARDG 
AGATREEKVKALLEEILERVTDEFNIPELMAKVE 
ERTPYIWAFQECGRMN1LTREIQRSLRELELGLK 
GELTMTSHMENLQNALYFDMVPESWARRAYPS 
TAGLAAWFPDIXNRDCELEAWTGDFTMPSTVWL 
TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 
KfTKKhJREEFRSPPREGAYIHGLFMEGACWDTQA 
GIITEAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 
SCPVTKTSQXRDPTWWTTOLKTKENPSKWVLA 
GVALLLQI


3509


A 


3 


633P 


ILYINPA: ~ ^^rPFVSSWI^r^^TERAl^Tl!.^ 

DXYLPTCXDTLRTRFKKn.: ^PtiQSI^YQMVCHLLE 

CLLTTEDIPADCPKErYEHYFVFAAIWAFGGAMV 

QDQL VD YRAEFSK WWLTEFK rVJTPSQGTEFD Y 

YmPETKKFEPWSKLVPQr^DPmviPLQAC^ 

SETTRVCYFMERLMARQRPA^MLVGTAGTGKSVL 

VGAKlASIJDPEAYLVKNVPFhnfYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKIIYrTDDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSIJCEITNV 

QYVSCMNPTAGSFTIM>RLQRHFSVFVLSFPGAD 

AI^SIYSIILTQHIJCLGNFPASLQKSIPr^^^ 

HQKIATTrXPTGIKFrmfTO^ 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

rTDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINR1LESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MD1ASLCIJKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENnSNVRNEVKSQ 

GLVDNRENCWKJT^RIRRQIJCVTLCFSPVGNKL 

RVRSRKPPAIVNCTAIHWFHEWPQQALESVSLRF 

IXJNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTIPKSrXEFIRLYQSIXHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKhJEDADKLIQVVGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELBCSFGSPPLAVSNVSAAVMVL 

MAPRGRWKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHEN(XKAIRPYLQDPEPNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDELLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENAHHNCE 

RWPLMVDPQLQGIKWDCNKYGEDLRVTQIGQKG 

YLQIEEQALEAGAWHENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTK^ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

G^GETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYHMNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSWQYTIRGLFECDKLTY1JVQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRITMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSDLFALCYFHAWAERRKFGPQGWNRSY

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

A?ATTT!SKVI^LL33Tn FRVTDEFNPEL* * 4 JCVF. 

Kk'ilr i r i v /AFQECGRMNILTREIQRSLRELELUi ; ! 

GEJ, iWSHMENLQNALYFDMVPESWARRAYPS 

TAGjL^AWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GDTEAKXKDLTPPMPVMFIKAIPADVRQDCGHVY 

SCPVTKTSQ\RDITYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3510 


A 


390 


3330 


AAGSGSRPPAPAARKMADLAECN1KVMCRFRPL 

l^SEVNRGDKYIAKJQGEDTVVIASKPYAFDRVF 

QSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYG 

QTSSGKTrnMEGKLHDPEGMGDPRIVQDIFNYIY 

SMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLSV 

HEDKNRVPYVKGCTERFVCSPDEVMDTEDEGKS

NRHVA\nmfNEHSSRSHSIFLINVKQENTQTEQK 

LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 

KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 

LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 

KNTVCVNVELTAEQWKKKYEKEKEKNKILRKri 

QWI^NEU^WRNGETWIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQIJDDKDEEINQQSQLVEKLKTQMLDQEEL 

LASTRRDQDNMQAELNRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEDKTKEYELLSDELN 

QKSATLASIDAELQKLKENTOIHQKKRAAEMMA 
SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 
RLY1SKMKSEVKTMVKRCXQLESTQTESNKKME 
ENEKEIj\ACQIJUSQHEAKIKSLTEYLQNVEQKK 

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK

VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE

AKAKLITDLQDQNQKMILEQERLRVEHEKLKA

TDQEKSRKLHELTVMQDRREQARQDLKGLEETV

AKELQTLH^RIOJ^QDIATRVKKSAEIDS\DDT

GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA

DLRCELPKLEKRLRATAERVKALESALKEAKEN

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI

AKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVA

VRGGGGKQV


3511 



A 


1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA
VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR
STOPPA1JCHPATKDLAAAAAQGPQLPPPQAQPQP
SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG
SRLANGLGREEAVAEGARRALLGSMPGLMPPGL

IAAAVSGUjSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

IXALSACAPFNVRFKKDHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNlj\PTPRRRKASPEPEGEAAGKMTTEE 

QQQRHWVAPGGPYSAETPGVPSPIAALKNVAEA 

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 

;^iKAQGPAG^VYCPSGr : ; :?j -vg-^vftv ^ - 

GEIATiLAGI^ ^ViCKERD? ' 


3512 


A 


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKEk 1 . JS| 

IEKKQLREQVNDLFSRKFGEAIGVDFPVKVPYk ! 

KITFNPGCVVIIXjMPPGVVFKAPGYIJEISSMR^ 

EAAEFIKFTVIRPLPGLELSNGEYSTVGKRKIE)QE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EYNIJOIHYQTNHSKHYIXJYMERM^ 

KGUOCYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGNLWEKLREKIRSFVAYSIAIDEITDINN 

TTQLAIFIRGVDEMnDVSEEIJJDTVPMTGTKSGN 

EIFSRVEKSLKNFCINWSKLVSVASTGTPPMVDA 

NNGLVTKLKSRVATFCKGAELKSICCIIHPESLCA 

QVKLKMDimiDVVVKSVNWICSRGLNHSEFTTL 

LYBLDSQYGSLLYYTEDCWLSRGLVLKRFFESLB 

EIDSFMSSRGKPLPQI^SIDWIRDLAFLVDMTMH 

LNALNISLQGHSQIVTQMYDLIRAFLAKLCLWET 

HLTRNNIjUIFPT^ 

TEFQKRI^DFKLYESELTLFSSPFSTKIDSVHEELQ 
MEVmLQCNTVUCTKYDKVGIPEFYKYLWGSYP 
KYKHHCAKILSMFGSTYICEQLFSIMKLSKT1CYC 
SQLKDSQWDSVLHIAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSELVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLffiRYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNII 

STLNPTAKRHLVIACHYD^^ 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLBFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNrTPNSARWFERLQAffiHELHELGLLKDHSLEG 

RYFQNYSYGGV1QDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDOTENLDESTIDNLNKILQVFVLEYL 

HL 


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPGNPLPDRLGEMAGGRHRRVVGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLBBRYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNn 

STLNrTAKRJHLVLACHYDSKYFSHW\NNRVF^ 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DI^LQLIFFIXjEEAFIJIWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL 


3515 


A 


114 



754 


LCRDLTTTMSSKRTKTKIXKRPQRATSNVFAMF 

DQSQIQErTCEAFNMIIXJNRDGFIDKEDLHDMLAS 

LGKNPTDEYLDAMMNEAPGPINFIM^ 

LNGTDPEDVIRNAFACFDEEATGTIQEDYLRELL 

TTVMGDR^TDE\EVDELYREAPI\DKKGOTFNYl\E 

FTTJ 7 * ETGGPiXKOnKXITFOIPSPNVPT lATF? 

Vrl>3EIFLLHGP 


3516


A 


1 


5169


MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKNYYFRGAAGDHGSCmTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRKEAL 

YRAIX5RVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLWSLREENPALRKDALQEL 

HICLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVIISLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRI1JXJEDYKNRTQAVEELK 

QVLGKFWSSTPHSSLVGnSLLYNLLDDSNFKVV 

HGTI^VLHLXVIRLGEQVC^I^GPVIAASVKV^ 

DNKLVIKQEYMKIFLKLMKEVGIK^VLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WLLAGNRTQSAHCHCGDHVRDSMHIYGSYSPTI 

CTRRVLSAGKGKNKLPWENEQPGIMGENQTSTS 

KD1EQFSTYDFIPSAKIJCLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTGTHQTNLS 

GKCAQLGFSQICGKTGSVGSDLQFLGTTSSHQEK 

VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

EJSYPVSSPRTSPKHTSPLIISPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRRSLSAQKSSVDPTGRXNHG 

\ENSQEKPP\VQLTPALVVRSPSSRRGLNGTKPYPPI 

P\RGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LPIDLSELNFKDKDLDQEEMHSSLRSLRNSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRIMSDIFPTFG 

SKPCPTRLSSAKKKISHIAEQSPSAGSSSNPQQISS 

FDfTTTKALSEDSVVVVGKGVFGSLSSAPATCSQ 

SVISSVENGDTFSIKQSIEPPSGIYGRSVQQN1SSYL 

DVENEKDAKVSISKSTYNKMRQKKKEEKEIJ^ 

KIX3EKKEKNSWERMRHTGTCKMASESETPTGAI 

SQYKERMPSVTHSPEIMDLSELRPFSKPEIALTEA 

1JUJJU)EDWEKKIEGI^RCLAAFHSEILN^ 

HETNFAWQEVKNLRSGVSRAAWCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVNNVTPARAVVSLINGGQRYYGRKMLFF 

MMCHPOTEKMLEKYVPSKDLPY1KDSVRNLQQK 

GLGEIPLDTPSAKGRRSHTGSVGNTRSSSVSRDA 

FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT

GLLNAKDFRDRINGIKQLLSDTENNQDLWGNIV 

KlFDAFKSRLmSNSKVNLVALETMHKMIPLLRD 

Hl^PIIhMLIPAIVDNNI^SKNPGIYAAATNVVQA 

LSQHVDNYL1XQPFCTKAQFLNGKAKQDMTEKL 

ADIVTELYQRKPHATEQKVLVVLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKP VLDREYLAIYLKM VFFTCN ACGES VKK1 
QVEKHVSVCRNCEC^SCIDCGKDFWGDDYKNH 

IQXISEL Ki^NVSPK 7RELLEQISAFD; : 7Kj uuv 

AKFQNWMKI^SIJ^VHNESELJXJVWNIFS 

PVNKEQDQRPlJroVANPHAEISTKVPASKV A 

VEQQGEVKKNKRERKEERQKKRKREKKELKLE 

NHQENSRNQKPKKRKKGQEADLEAGGEEVPEA 

NGSAGKRSKKKKQRKDSASEEEARVGAGKRKR 

RHSKVETDSKXKKMKLPEHPEGGEPEDDEAPAK 

GKFNWGTIKAILXQAPDNEITIKKLRK^ 

YTVTDEHHRSEEELLVIFNKKISK^^ 

VKLVK 


3518 


A 


3 


635 


APDSNARNDHFDACSLRVQAGLSSAGPALGNSG 

LAALMASPSKAVIVPGNGGGDVTrHGWYGWVK 

KELEKIPGFQCIAKNMPDPITARESIWLPFMETEL 

HCDEKTHIGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERASGYFTRPWQWEKIKANCPYIV 

QFGSTDDPFLPWKEQQEVAD\SWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A 


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KRNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM 






WO 01/57190 



PCT/US01/04098




YRYNALYSIJDGI^VVNVKDNPPMKDMFKLLMF 

PENRIFQAENAKIKIIEWIJSVLEDTKRAI^EKRI^ 

EQEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE

VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 

DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLRNRAAAVHTAIRQLRIEGAIXLYIHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFVVW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

KEHCQQLGDIGLDLTFnHALLVKDIQGALHSYK 

EHIEATKHRNSEEMWRRMNLMTPEALGKLKEE 

MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 

MGFLEEAIJCLYFPELHMVLLESLVEULVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVL\PWEK 

RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV 


3520 


A 


1706 


540 


FVAHLAWPWRADGDMEIXjVLNEGFLVKRGHIV 

IINWKARWHLRQNTLVYYKLEGGRRVTPPKGRI 

LUXKnTTQ^LEYENRPLLIKLKTQTSTEYFLEA 

CSREE/RRDAWAFEMTGAIHAGQARGKVQQLHS 

I^NSFKIJPHISLHRIVDKMHDSNTGIRSSPNMEQ 

GSTYKKTrT.GSSLVDWLISNSFTASRLEAVTLAS 

MLMEENFLRPVGVRSMGAIRSGDLAEQFLDDST 

ALYTFAESYKKKISPKEEISLSTVELSGTVVKQGY 

IAKQGHKRKNWKVRRFVLRKDPAFLHYYDPSK 

EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 

GNLFKVTTKNDDTHYYIQA\SSKAE\RAE\WIGSLS

KSLNMNKDPEGTPDSLPSLPR 


3521 


A 


3 


3063 


HASVSLSLGCPRPCADTPGPQPQPMDLRVGQRPP 

VEPPPEPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSJ..VPSLPAFSIPRHQSQSST 

^nJGC^C y^'iMDTr> IFO ~yn .t «Q3QELT*QJ 

LHBCDKSk.(isA vA^^VXHCQKLAEVILKKQQAALE 

RTVHFNSPUX : YRTLEPLETEGATRSMLSSFLPPV 

PSIJPSDPPEHFTIIJ^TVSEPM-KUIYKPKKSI^RR 

KNPIXRKESAPPSLPi*RPAETLGDSSPSSSSTPAS 

GCSSPNDSEHGPNPELXjSEALLGQRLRLQETSVAP 

FAIJ > TVSLIJ > AITLGLPAPARADSDRRTFIPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGIXjPI^FHFAQSLMTTERLSGSG 

LHWPLSRTRSEPLPPSATAPPPPGPMQPRLEQLKT 

HVQVTKRSAKPSEKPRLRQIPSAEDLETDGGGPG 

QWDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEQQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSPAAPASLSAPEPASQARVLSSSETPARTLPF 

TTGLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLQERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTNPI^RLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK 

VASRELKNGFAVVRPPGHHADHSTAMGFCFFNS 

VAIACRQLQQQSKASKILIVDWDVHHGNGTQQT 

FYQDPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIWM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCFGYMTQQLMN1AGGAWLALEGGHDLTAIC 

DASEACVAALLGNRVDPLSEEGWKQKPNLNAIR 

SLEA\V1RVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 


9 


602 


KMAALGEPVRLERDICRAIELLEKLQRSGEVPPQ 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 
EVRANATAKATVAAFAASEGHSHPRWELPKTE 
EGLGFNIMGGKEQNSPIYISRIIP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLVVRYTPKVLEEMESRPEKMRSAKRRQQT 


3523 


A 


645 


1465 


IMAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRPNRQLGSMVEIAKQL\RPSSGRS 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTVVPLDDATQEYKEKLQKCLEAVLNQKLQEI 

TRCKSSEEKKPGELKRLVESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDDLQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 


3524 


A 


3 


698 


PMVRHEAGEALGAIGDPEVLEILKQYSSDPVIEV 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 

PAEER\DVGRLREALLDESRPLFERYRAMFALRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGAIA 

RPACLAALQAHADDPERWREXSCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452 



694 


EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 
SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 
MTDGQLRSKRDEFWDTAPAFEGRKEIWDALKA
AAYAAEANDHELAQAILDGASITLPHGTLCECY 
DELG^YQLPIYCLSPPVNLLLEHTEEF5I,EPPEP 
??fYRREFF^C,?,I ZTTGKDYTLSASL} *•'. i . GCIK 
RQLHAQE/GTPKPS WQRV , SGKLLTLRTRLQET 
KIQKDFVIQVIINQPPPPQD 


3526 


A 


123 


SMI 


PG^GI^IAADHNEDLGHl^ADAPWPAVTMAP 

RKRSHHGLGFLCCFGGSDIPEINLRDNHPLQFME 

FSSPIPNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKKEQEDPNKLATSWPDYYIDRI 

NSMAAMQSLYAFDEEETEMRNQWEDLKTALR 

TQPMRFVTRFIEIJBGLTCXLNFLRSMDHATCESRI 

HTSUGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

RTENSKTKVAVLEILGAVCLVPGGHKKVLQAML 

HYQVYAAERTRFQTLLNEUDRSLGRYRDEVNLK 

TAIMSFINAVLNAGAGEDNLEFRLrlLRYEFLMLG 

IQPVIDKLRQHENAILDKHLDFFEMVRNEDDLEL 

ARRFDMVHIDTKSASQMFEL1HKKLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI 

VLQDERGVDPDIJ^LEOTNVK^JIVNM 

KQWRIXJAEKFRKEHMELVSRLERKERECETKTL 

EKEEMMRTVLNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL 

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG 

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW 

VKLNEERVPGTVWNEIDDMQVFRILDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTBDIYLASRKVK 

ELSVIDGRRAQNCIILI^KLKI^NEEIRQAIIJKMD 

EQEDLAKDMLEQLLKFlPEKSDroU^EHKHEIER 

MARADRFLYEMSRIDHYQQRIXJALFFKKKFQER 

LAEAKPKVEAIIJLASRELVRSKRLRQMLEVILAI 

GhlFMNKGQRGGAYGFRVASLNKIADTKSSIDRN 

ISLLHYLIMILEKHFPDn.NMPSELQHLPEAAKVN 

LAELEKEVGNLRRGLRAVEVELEYQRRQVREPS 

DKFVPVMSDFITVSSFSFSELEDQLNEARDKFAK 

ALMHFGEHDSKMQPDEFFGIFDTFLQAFSEARQD 

LEAMRRRKEEEERRARMEAMLKEQRERERWQR 

QRKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 

LCKLKRSRKRSGSQALEVTRERAINRLNY 


3527 


A 


1445 


714 


LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 

MEKTAVAAEVLTEDCNTGEMPPLQQQIIRLHQE 

LGRQKSLWADVHGKLRSHIDALREQNMELREKL 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEEUPKYAGHKNXQSGHSSWGQRSSS 

hWSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 


3528 


A 


484 


1777 


RJSKIQVYYSTGYSSRKMNPTLGIAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKLIJCKlAE^yNPGRNIFLSPLSISTAFS 

MLCXGAQDSTLDEIKQGFNFRKMPEKDLHEGFH 

YIIHELTQKTQDLKLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAETILTKFQ^EMAQKQINDFI/ESKTH 

GKIhWLIEhnDPGTVMLLANYIFFRARWKH^ 

NVTKEEDFFLEKNSSVKWMMFRSGIYQVGYDD 

Kl^CmEIPYQKNITAIFILPDEGKLKHLEKGLQV 

DTFSRWKTLLSRRVVDVSVPRLHMTGTFDLKKT

LSYIGVSKIFEEHGDLTKIAPHRSUCVGEAVNKA 

Fl^^rJjTEGAAOTO - 1 <?TLPME1?LWFK>KP 

Yl ; Li t oLi^TPSVLFlXjiOVNPIGK 


3529 


A 


1 


5684 


VSSV^HENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTT/HIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYV(^FLTRLINLYnQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKBKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQ\nXJRHDIARVL£PLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQHTSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMWSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDL1CKWSGLEVESASVTSQLEBEAMPPKC 

SDIDPDEEITKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKEITVKESG 

KQPGAKPKVKLARJOCDDDKKKSSNEKLKQTSV 

FFSIX5LDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFMHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TCOTAFVHAISTTC\WNAYTr^ 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDUGNIO^QMMSIEILTLL 

FTEIj\KVIESSAKGFPSnSDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEEVNBTGFDFVVS\DLEfflSPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

IXPPTTQYHQLLVSVDQKHIJEARSGII^IIJIMI 

MSSVTTJ^WSnJiQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVIJKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFIJLGVLNEFIMKNPSLENK^ 

HKIVDA1GAIAGSSLEQTTWLRRNLEVKPSPKIM 

VIX}TNLESDVEDMLSPAMETANITPSVYSVHAL 

TLl^EVLAHLIJDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLNfTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

IAIJU^SENLPQFQMYRWAFIPEASDDSGLEVRR 

QGfflQREFKPYVVRIJUaXRKRAKKNPEEDNSG 

BT?X3WEPGiiIJ.LTICTVRS?.lEQLLP? " r T,?~V7 

NSKVTSRCGGHSGSP1LV : j ;\< e FPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3530 


A 


1 


5684


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSWTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYHQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAWIRPPLTQGNIJ^YIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

S(^LTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEDENFSLTVNPLSDRLSL 

LSTSSETIPMWSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QWFDLICKWSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFNIHPLYQHV1XYLQLYDSSRTLYAFSAIKAILK 

TWIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL

FTELAKVIESSAKGFPSnSDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T

IPEEXNETGFDFWSVDLEfflSPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT

STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

IJLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL

TlXSEVlAHLLDMVr^SDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE

LEQRAMLLKRLAFALFSSEIDQYQKYLPDIQERLV

ESLRLPQVPILHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

I AIAU^SENLPQFQMYRWAFIPEASDDSGLEVRR 

RTLGWEPGHLIXw-:;^ 

NSKVTSRCGGHSGSPn.YSNAFPNKDMKLENHKP 
CSSKARQKIEEMVEKDFLEGMIKT


3531 


A 


553 


2470 


USPSPALSSQDPALSLKENLEDISGWGLPEARSK 

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQN1XAU3PPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

IX3RP\V1^'PLRCPLFAQQRVPEGGFLLDTRKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTOLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRMTGERPYKCSACEK 

AFSCSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHRIHTGEKPYQCGSCGKAFTCHSSLTVH 

EKMSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRIHSGEKPFKC 

NECGKAFSSHAYLIVHRRIHTGEKPFT)CSQCWKA 

FSCHSSLIVHQRIHTGEKPYKCSECGRAFSQNHCL 

IKHQKmSGEKSFKCEKCGEMFNWSSHLTEHQRL 

HSEGKPLA1QFNKHLLSTYYVPGSLLGAGDAGLR 

DVDPEDALDVAKLLCWPPRAGRNFSLGSKPRN 


3532 


A 


3931 


317 


HRELQDSPSAEPPAGSMPLRHWGMARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSE5 










SLTAEEVCIHIAHKVGITPPCFNLFALFDAQAQV 

WlJTNHILEIPRDASmL^^ 

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 

FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE 

SLGMAFLHLCHLALRHGIPLEEVAKKTSFKDCIP 

RSFRRHIRQHSALTRLRIJWVFRR^ 

WMVMVKYLATLERIJVPRFGTERWVCHLRLLA 

QAEGEPCY1RDSGVAPTDPGPESAAGPPTHEVLV 

TGTGGIQWWPVEEEVNKEEGSSGSSGRNPQASL 

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 

HVGLKEHCVSIHRQDNKCLELSLPSRAAALSFVS 

LVIXjYFW^TADSSHYIXHEVAPPRLVMSIRDGIH 

GPIJLEPWQAKLRPEIX5LYLfflWSTSHPYRLILTV 

AQRSQAPIXjMQSLRLRKFPIEQQDGAFVLEGWG 

RSFPSVREIXjAALC^CUJRAGDDCFSLRRCCLPQ 

PGETSNLDMRGARASPRTLNI^QI^rllRVDQKEI 

TQLSHIXjQGTRTNVYEGRIJIVEGSGDPEEGKMD 

DEDPLVPGRDRGQELRWLKVLDPSHHDIALAF 

YETASLMSQVSHTHLAFV^GVCVRGPENIMVTE 

YVEHGPLDVWLRRERGHWMAWKMVVAQQLA 

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 

FOCLSDPGVGLGALSREERVERBPWLAPECLPGG 

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 

EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 

RPSFRmRDLTRLQPHNLADVLTVNPDSPASDPT 

VFHKRYLKKIRDLGEGHFGKVSLYCYDPTNDGT 

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH 

EHIIKYKGCCEDQGEKSLQLVMEYVPLGSLRDYL 

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL 

AARNVLLDNDRLVKTGDFGLAKAVPEGHEYYRV 

^EDGDSpVFWYAl" : "LK^VK *YY> ' v^r-QVT ! 

LYELi. i ;CDSSQSI ?TXFLELIGlAQ^,>Mi :LT 

EIJ^GERIJPRPDKCPCEVYHLMKNC^r,ivrEASF 

RPTFENLIPILKTVHEKYQGQAPSVFSVC 


3533 


A


182 


3465 


reWLDFFRGSINSQFEFGRKKENMTSPAKFtw' DK 

EHAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEIEMDYSKNLEKLAERFLAKT 

RSTKIXJQFKKIX?NVLSPVNCWNLLLNQVKRES 

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV 

GQQIXJDDLMKVU^LYSVMKTYHMYNADSISA 

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

NVRIEEKHVRRSSVKKIEKMKEKRQAKYTENKL 

KAIKARKEYLLAIJEATNASVFKYYIHDLSD1JDQ 

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD 

AIENAVENLDATSDKQRLMEMYNNVFCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

STLKIENEEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFmXMKEYLEGRNUTKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLWESCIR 

FISRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDEF 

HDIJ^ACVTMDNLQERAOIIRKVlXVLPKTTLn 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELIKTniQHENIFPSPRE 






WO 01/57190 



PCT/US01/04098













LEGPVYSRGGSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPBEAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGIDGLIPHQYIW 

QDTEDGVVERSSPKSEIEVISEPPEEKVTARAGAS 

CPSCKjHVADIYIANINKQRKRPES^ 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

rT)KCSISGHGSI^SISRHSSLKNRLDSPQIRKTAT 

AGRSK5FDNHRPMDPEVTAQDIEATMNSALNELR 

ElJERQSSVHKJnPDVVLimEPLKTSPWAPTSEPS 

SPLfTTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPAT\RPKPT\VFPKTNATSPGVNSST 

SPQSTDKSCTV 


3534 


A 


1 


2640 


FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGIKLSADVKPFVPRFAGLNVAWLESSEACV 

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 

KKTYDEKKTYDQQKFDSERADGTISSEIKSARGS 

HHl^IYAENSLKSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREWKPAAVLSKGEIVVKNNPNESV 

TANAATNSPSCTRELSWTPMGYWRQTLSTELS 

AAPKNVTSMINLKTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRIEDAEEFPNLAVAS 

ERRDRIETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPVWSVGAV 

PVLSKECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREIPKAKKPTSLKXIILKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDFt J;^VEDK25E?roTELQRDTEAm - FN 

mV?rrwZr:-^FRDYCSQMLSKEV^ 

LVRFOORMYQKDPVKAKTKRRLVIX5IJIEVLKH 

LKLKKiXCA^nSPNCEKIQSKGGlJDDTLHTIIDYA 

CEQNIPFVrALHiaCAI^ 

DGAQDQFHKMVELTVAARQAYKIMLENVQQE 
LVGEPVSLRHIJPAYPHRAPAALQKMAPQPATCEK 
EEPHYIEIWKKHLEAYSGCTLELEESLEASTSQM 
MNLNL 


3535 


A 


1747 


983 


LFQFQVCRSVLSPRAAGCTWSLAPRSRGAAGSPR 

RYRGPQPQPAPPSALPNSRPSPVASGREMWLSV 

PAEVTVIIXDmGTTIPIAF/KDIUTYffi 

LQTHWEEBECQQDVSLLRKQV\FADVVPAVRKW 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGHKVESESYRKIADSIGCSTNNILFLT 

DVTREASAAEEADVHVAVWRPGNAGLTODEK 

TYYSLITSFSELYLPSST


3536


A


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTS 

IESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 

RRRAGSPRRCAPRPRACPQGWSRARHQPGGLCL 

LLLLLCQFMEDRSAQAGNCWLRQAKNGRCQVL 

YKTELSKEECCSTGRLSTSWTEEDVNDNTLFKW 

MIFNGGAPNCIPCKETCENVDCGPGKKCRMNKK 

NKPRCVCAPDCSNTrWKGPVCGLDGKTYRNECA 

LLKARCKEQPELEVQYQGRCKKTCRDVFCPGSS 










TCVXVDQTNNAYCVTCNRICPEPASSEQYLCGND 
GVTYSVSACHLRKATCLLGRSIGLAYEGKCIKAK 
SCBDIQCTGGKKCLWDFKVGRGRCSLCDELCPD 
SKSDEPVCASDNATYASECAMKEAACSSGVLLE 
VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCNDRF 

LTSIPTGIPEDATTLYLQNNQINNAGIPSDLKNLL 

KVERIYLYHNSUDEFPT^PKYVKELHLQENNm 

TITYDSLSKIPYLEELHLDDNSVSAVSIEEGAFRD 

SNYLR1XFLSRNHLSTIPWGLPRTIEELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 

GIFDDUDNITQLILRNKPWYCGCKMKWVRDWL 

QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGIVSTIQITTAPNTVYPAQGQWPAFVTK 

QPDIKNPKLTKDHQTTGSPSRKTITITVKSVTSDTI 

HISWKLALPMTALRI^WIJKLGHSPAFGSITETIVT 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 

EIPVCffiTETAPLRMYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SRNCAYSKGRRRKDDYAEAGTKKDNSILEIRETS 

FQMLPISOTPISKEEFVIHTIrTPNGMNLYKNNH 


3538 



A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACIl^KTENSLVGLPSCVDEVTECNL 

ELKDTMGIADKTENTLERKKIEPLGYCEDAESNR 

QLESTEFNKSNLEWDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKEESHETAl^LQDDRNSQSSSV 

SYLESKSVKSKHTKPVIHSKQNMTTDAPKKIVAA

KYEVIHSKTl^^ VK^^^rTD^r^^T^HRPV 

K\ v, "£KQIDKEFiHQSCNSGVKS V # NQATwVLKK 

TLQDQTLVQIFKPLTrlSLSDKSHArr GGLKEPHH 

PAQTGHVSHSSQKQCTKPQQQAPAMICmsHVX 

EELEHPGVEHFKEEDKIJKIJCK^^ 

KSFSU^EPPLFffDNIATIRREGSDHSSSFESKYMW 

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

IJDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYIDDTVKHKVKILKRESGEGRNSSTXRD 

NEIKKWQLAPLRKMGQPVI^RRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKKKVVEKGVL 

KVTHPAASASCTSADQIRQSVRHSLKDILMKRLTD 

SNLKWEEKAAKVATKIEKEU^FFRDTDAKYKN 

KYRSLMFNLKDPKNMIJFKKVLKGEVTO^ 

MSPEELASKELAAWRRRENRHTIEMIEKEQREVE 

RRPITKITHKGEIEIESDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEKVRKEEVDSMSKDTTSQHRQHLF 

DLNCKICIGRMAPPVDDLSPKKVKVWGVARKH 

SDNEAESIADALSSTSNILASEFFEEEKQESPKSTF 

SPAPRPEMPGTVEVESTFLARLNE ? IWKGFINMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKIKASGTKEICVVRFIPVTEEDQISYT 

LLFAYFSSRKRYGVAANNMKQVKDN1YLIPLGAT 

DKIPHPLWFDGPGLEIJIRPNLLLGLnRQKLKRQ 










HSACASTSHIAETPESAPPIALPPDKKSKffiVSTEE 

APEEEhTOFmSFTTVLHKQRNKPQQNl^ 

VEPLMEVTKQEPPKPLRF1JH3VLIGWENQPTTLE 

1ANKPU>VDDILQSUX}TTGQVYDQ\AQSV^ 

NTVKEIPFLNEQTNSKIEKTDNVEVTDGENKEIK 

VKVDN1SESTDKSAEIETSVVGSSSISAGSLTSLSL 

RGKPPDVSTEAFLTNLSIQSKQEETVBSKEKTLKR 

QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN 

VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 

TTSESKIXjDSCRNGEKHMLPGLSHNKEHLTEQIN 

VEEKLCSAEKNSCVQQSDNLKVAQNSPSVENIQT 

SQAEQAKPLQEDILMQNIETVHPFRRGSAVATSH 

FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 

QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 

PPPLU>PPGFG\FA\QNPMVPWPPVWILP\GQPQR 

MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 

WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 

ERHEKEWEQESERHRRRDRSQDKDRDRKSREEG 

HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 

KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 

DHTDRTKSKR 


3539 


A 


157 


1769 


GSWTVEI^IJKJ^ASPSLKWVCLPGAAAVNKHRS 

GAGGLIRSUQCTWAPAGPARRGGRGIEDFPYLF 

FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 

NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 

NVTWTELEDRDGRVYPHPQDLLAALPLALVLLA 

MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 

EKHFLTEGHRPKEPQLSLLAAQCGLTLQQTQRW 

FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 

SVLYHESWLWAPW^CWDRYPNQLTLSCPAADS 

SSIKPIu H*x)ir* )TA\DFJCEQ>OHHFVAVILMTF&Y 

SANLLRiX . 3LVLLLHDSSD YLLEACKMVNYMQY 

QQVCDAl^Ll-SFVFFYTRLVlJTT 

SNRGPrTGYYl'f^3LLMLIX}LLHVFWSCLEJ^ 

YSFMKKGQMEKDIRSDVEESDSSEEAAAAQEPL 

QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 


3540 


A 


267 


1397 


SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKKLKRYFVDYRRVLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHVDJIVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYTRVPVGTL 

VKEGGRWADLSCVGDEYIAALGGAGGKGNRF 

FlANNNRAPVTCTrXlQPGQQRVIJII^IJKTVAHA 

GMVGFPNAGKSSIJJIAISNARPAVASYPFTTLKP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

FUIHIERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKIDLPEAQANLSQLRJDH 

LGQEVTVLSALTGENLEQLLLHLKVLYDAYAEA 

ELGQGRQPLRW 


3541 


A 


1 


8008 


DTQVSETLKRFAGKVTTASVKERREILSELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTCLLVRIVFPSRAKRQGDI 

WNKLVEVQCLLLLEVLGGSHKHAVIXjAVKKLT 










KLWKENTOLVEQYLSAILSLEPNQNYAGMLGLL 

VQFCTSHKEMDVVSQHKSALLDFYMKNILMSK 

VKPPKYLLDSCAPLLRYLSHSEFKDLILPTIQKSL 

LRSPEKVIETISSIXASVTLDLSQYAMDIVKGLAG 

HLKSNSPRLMDEAVLALRNLARQCSDSSAMESL 

TKHLFAILGGSEGKLTVVAQKMSVLSGIGSVSHH 

WSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 

VSV1ALWCNRFTMEWKKLTEWFKKAFSLKTST 

SAVRHAYLQCMLASYRGDTLLQALDLLPLUQT 

VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 

EAKLSSFWQLIVDEKKQVFTSEKFLVMASEDAL 

CIVLH\LTER1JT^DHPHRLTGNKVQQYHRALVA 

VLLSRTWHVRRQAQQTWK1XSSLGGFKLAHGL 

LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY 

WPRVLQEAIXIVISGVPGLKGDVTDTEQLAQEM 

LHSrlHPSLVAVQSGLWPALLARMKIDPEAFITRH 

LIXJHPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 

PQLISTTTASVQNPALRLVTREEFAIMQTPAGELY 

DKSnQSAQQDSIKKANMKRENKAYSFKEQIIELE 

LKEEKKKKGIKEEVQLTSKQKEMLQAQLDREA 

QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 

PVLVDSFLPLLKSPLAAPRIKNPFLSLAACVMPSR 

LKAIX5TLVSHVTLRLLKPECVLDKSWCQEELSV 

AVKRAVMLLHnmTSRVGKGEPGAAPLSAPAFS 

LVFPFLKMVLTEMPHHSEEEEEWMAQELQILTVQ 

AQLRASPNTPPGRVDENGPELLPRVAMLRLLTW 

VIGTGSPRLQVLASDTLTTLCASSSGDDGCAFAE 

QEEVDVLLGALQSPCASVRETVLRGLMELHMVL 

PAPDTDEKNGLNLLRRLWWKFDKEEEIRKLAE 

RLWSMMGLDLQPDLCSLITDDVIYHEAAVRQAG 

AEALoQ. WARYO?,QAAE^* m *3T "OEKl VT 

ppfvld/jlokv \ ^ppdqvearcglalalk; v -..s | 

QYIJ)SSQVKPLFQFFVPDALNDRHPDVRKCML 7 | 

AAI^TUmiGKENVNSLLPVFEEFLK^APhfDAS 

YDAVRQSVVVLMGSLAKHLDKSDPKVKPIVAKL 

IAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LJCQQEMMAALTDAIQDKKNFRRREGALFAFEM 

IXITMLGKLFEPYVYHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNI^AHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQKAGQQAIJIQIGSVIRNPEIIjU 

APVIJJ)ALTDPSRKTQKCLQTLLDTKFVHFIDAP 

SLALIMPIVQRAFQDRSTDTRKMAAQIIGNMYSL 

TDQKDLAPYLPSVTrX}IJCASLlI)PVPEVRTVSAK 

ALGAMVKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPHVRIX5Y]MMrT>rYLPITFGDKFTPYVGP 

PCILKALADENEFVRDTALRAGQRVISMYAETAI 

AIJXPQI^C^UTDDLWRIRFSSVQLLGDLLFHISG 

VTGKMTTETASEDDNFGTAQSNKAI1TALGVERR 

NRVLAGLYMGRSDTQLVVRQASLHVWKIWSN 

TPRTLREILPTLFGLLLGF1ASTCADKRT1AARTL 

GDLVRKLGEKILPEHPILEEGLRSQKSDERQGVCI 

GLSEMKSTSRDAVLYFSESLVPTARKALCDPLE 










EVKEAAAKTFEQLHSTIGHQALEDE^FLLKQLD 

DEEVSEFAIJX3IJCQVMAIKSRVVIJ>YLVPKLTTP 

PVhTTRVLAFLSSVAGDALTRHLGVII^AVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLIJEATRSPEVGMRQAAAIILNIYCSRSKADYTS 

HLRSLVSGLIRLFNDSSPVVLEESWDALNAITKK 

U>AGNQLAIJEELHKEIRLIGNESKGEPIVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADAIJIPSWSITCP1JBULGDRFSWNVKAAL 

LETLSIXIAKVGIALKPFU^LQTTrTKALQDSNR 

GVRLKAADALGKLISIHIKVDPLFTELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVIRKNIVS 

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 

AVLQQOJjUJVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMmSSATADRIPIAVSGV 

RGMGFLMRHHIETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLD^^1CDK^^VVRAYSDQAIVNL1JCMRQGEEVF 

QSLSKI1J)VASLEVLNEVNRRSIJKKLASQADSTE 

QVDDULT 


3542 



A 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAP 

GMPVGLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 

GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 

GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 

GERGEKGEPGVRGAIGSKGESGVDGLMGPAGPK 

GQPGDPGPQGPPGLEKjKPGREFSEQFIRQVCTDV 

1RAQLPVLLQSGRIRNCDHCLSQHGSPGPGPPGP1 

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 

GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 

SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 

F 5L r7?7IARRDFFRKCriTY 




3543


A


654 




PARSLEKMKASWLSLLGYL FSGA YILGRCTV 

AKiaHDGGXJ)YFERYSLENWClAYFESKFNPS\ 

AIYENTREGYTGFGLFQMRGSDWCGDHGKNRC 

HMSCSAIXNPNLEKTIKCAKTrVKGKEGMGAWP 

TWSRYCQYSDTLARWLJDGCKL 


3544 


A 


2 


1074 


SC^LAAGRI^QWIJLRASRSGMLRAGWLRGAAA 

LALLLAARWAAFEPITVGLAIGAASAITGYLSY 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 

LATEVIVFKALTGFRNNKNPKKPLTLSLHGWAGT 

GKNFVSQMGAE^IJn'KGLKSNFVHIJ^STLHFP 

HEQKIKLYQDQLQKWmGNVSACANSVFTFDEM 

DKL\rn^IMAIOFU)YYEHVERVSYR\KAIFIFLS 

NAGGDUTKTALDFWRAGRKREDIQLKDLEPVL 

SVGVFNNKHSGLWHSGLIDKNLIDYFIPFLPLEYR 

HVKMCVRAEMRARGSAIDEDIVTRVAEENITFFPV 

RDEKJYSDKGCKTVQSRLDFH 


3545 


A 


3 


273 


SAQGRSWGRFYRQIKRHPGIIPMIGLICLGMGSA 

ALYLLRLALRSPDVW*SWDRKNNPEPWNRLSPN 

DQYKFLAVSTDYKKLKKDRPDF 


3546 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 
PKVPDCMQVKHWPSEQDPEKAWGARVVEPPEK 
DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 













GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA
PKVPDCMQVKHWPSEQDPEKAWGARVVEPPEK
DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR
GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGlAlJUT^XEK^GKAFPlrTYEEKLKLVALffi 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVKLimCCmFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEERLRREEEERRRDEEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVWAG 

SSLPTSSKVECNCTQVI*CQFNRQAKTH f IT>SSEKE 

LEPEAAEEALENGPKESLPVIAAPSMWTRPQIKD 

FKEKIQQDADSVriVGRGEVVTVRVPTHEEGSYL 

FWEFATDNYDIGFGVYreWTDSI^NTAVSVHVSE 

SSDDDEEEEEMGCEEKAKKNANKPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 


3549 


A 


1837 


3593 


PAVLVLEPASQSRKQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRDDAATRRRRGRRKHVEGGMD 

LIFUCEQTLQAGn^VHEDPGQATLSriHPEGPGP 

ATSAKPAT. 1 . AliryjZKSIPaKSL^DV.-RQQAD 

SLEVPGFGAKl SD]^i:QRRPRCKEPGKLDVSSLS 

GEERWAIPKEP-j!JfcGFIJ>ENK^ 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG

UAGMDLVGUJNMRNMPGIPLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPFLIPGVSPGLIYPSMFLSPGMGMALPAM 

QQARHSEIVGLESQKRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA 

LKDSNNDTO 


3550 


A 


287 


39 


QLNL>nOATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKIJEEGCFDKAYYVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WhffiQMJJ»KSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQHQLQVLNKAKERQLENLIEKLNESERQIRY 

LNHQLVIIKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQIKAIJETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESIVMGLTKKY 

EEQVLSLQKN1JDATVTALKEQEDICSRLKDHVK 










QLERNQEAIKL^KTEIINKLTRSLEESQKQCAHLL

QSGSVQEVAQLQFQLC^AQKAHAMSANMNKA

LQEELTEIJCDEISLYESAAKLGIHPSDSEGELNIEL

TESYVDLGIKKVNWKKSKVTSIVQEEDPNEELSK

DEFILKLKAEVQRIXGSNSMKRHLVSQLQNDLK

IX2HK3CIEDIJIQVKKDEKSIEVETKTDTSEKPKNQ

LWPESSTSDWRDDILLLKNEIQVLQQQNQELKE

TEGKLRNTNQDLCNQMRQMVQDFDHDKQEAV

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL

FEAYERTHLQLRSEUDKLNKEVTAVQECYLEVC

REKDNLELTIJRKTTEKEQQTQEKIKEMQQLEK

EWQSKLDQTIKAMKKKTLIX:GSQTIX)VTTSDVI

SKKEMAIMIEEQKCNQQNLEQEKDIAIKGAMKK

ISELELKHCENTIXQVEIAVQNAHQRWLGELPE

LAEYQALVKAEQKKWEEQHEVSVNKRISFAVSE

AKEKWKSE1JENMRKNILPGKELEEKIHSLQKELE

LKNEEWWIRAEIjU<ARSEWNKEXQH 

QNEQDYRQFLDDHRNKINEYLAAAKEDFMKQK 

TEIXLQKETELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEHISDSEDKQLLEI 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV 

ENADPEWKKRNMAELSKDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEEKDLCCGHC 

FQELEKAKQECQDLKGKLEKCCRHLQHLERKHK 

AWEKJGEENNKVVEEUEENNDMKNKLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK 

AVKKIKCDMLRYIQESKERAAEMVKAEVL*ERQ 

ETARKMRKYYLICLQQILQDDGKEGAEKKIMNA 

ASKLATMAKLLETPISSKSQSKTTQSGMSK 


3552 


A 


771 


375 


ARTRQTSGQAREPEKESPAPGGGGLAEIRSROOL 
SQToRJP?' AKDQAVE AMFPPARCyXFLLSFEI ■ ' v* 
MYFTREEWGHLNWGQKDLYR1 *■ * J.tLENYRK MV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 


3553 


A 


76 


72 


PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPSAGQATGPGEGRRSTB 

SEVYDIXjTNTFFWRAHTLTVLFILTCILGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSVVYELI1JFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNIWDKIJXjFVPAHFI^WYIXTLMIRDWW 

MCMnSVMFEFL£YSLEHQLPh^ECWWDHWIM 

DVLVCNGLGIYCGMKTLBWLSLKTYKWQGLWN 

IPTYKGKMKRIAFQFTPYSWVRFEWKPASSLRR 

WlAVCGIE-VFIIAELNTr^lja^WMPPEHYLV 

LIJ^VrTVNVGGVAMREIYDI ? MDDPKPHKKLGP 

QAWLVAATTATELLIVVKYDPHTLTLSLPFYISQC 

WTLGSVLALTWT\n^RFFLRDITLRYKETRWQK 

WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 

GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR 


3554 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

WSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRDLERIEDSTGLNRrXjPAPLSSRKHVLYVE 










HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF

AFEHSEEY(^AQHKI^VAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLELSLEPDEDPLCM

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLUQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVII^EIKEAVAALPPDVTTQSV 

MGFDPIJ>PSDTIYSYVRPERI^PISHGNTIALFFRS 

IJLJPTmMEGERPEEGVAGGLNRN(^LNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3555 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKKKXSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEY(^AQHKI^VAVESMEPNNIVVLLQT

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASIIJOOALTMFPGVLLPLLESCSVRPT)ASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV

LYQRAPRNIHRHVn^EIKK^./AALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERliiPi^HGNTlALFFRS 

IXrWIMEGERPEEGVAGGLNRT^QOimLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3556 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL

VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITII^AEPIR 

MLEEEGERELPEVSRRELDLLIAEEEEAILLEIPRL

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPWPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATTWLAATRGSRL 
VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP










RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIFVTV^ 

MLEEEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

tPVVPELPEWMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3558 


A 


489 


2360 


IRPRPRGRRRALDSFNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKJSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE

SCSFARHSLLQTLYKV 


3559 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLTTAGAQLVELDLSDNAFGPDGVQGFE 

ALIJCSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE

SCSFARHSLLQTLYKV 


3560 


A 


2 


1198 


FVRELPRPRPGAATAAIMVSVINTVDTSHEDMIH
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQILIA 










DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV 

IIWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLI1ACGSSDGAISLLTYTGEGQWBVKXINNAHT 

IGCNAVSWAPAWPGSLEDHPSGQKPNYIKRFAS 

GGCDNLIKLWKEEEDGQWKEEQKLEAHSDWVR 

DVAWAPSIGLPTSTLASCSQDGRVFIWTCDDASS 

KIWSPKLLHKFNDVVWHVSWSrTANILAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

GKSPQLQQDYFPRRSYRCSHRLnCLNVIGDAL 


3561 


A 


540 


86 


WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 

VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 

RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 

YGEWRGSGQKTGQPS*TTMQGGETEENRTBTTT 

GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR 

KVQSGNINAAKTIADIIRTCLGPKSMMKMLLDP 

MGGIVMTODGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVIII^GEMI^AEHFI^QQMHPTV 

VISAYRKALDDMISTLKKISIPVDISDSDMMLNIIN 

SSITTKAISRWSSLACNIALDAVKMVQFEENGRK 

EIDIKKYARVEKIPGGIIEDSCVLRGVMINKDVTH

PRMRRYDCNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTRILQMEEEYIQQLCEDIIQLKPDWITEKG1S 

DLAQHYLMRANITAIRRVRKTDNNRIARACGARI 

VSRPEELREDDVGTGAGLLEIKKIGDEYFTFITDC 

KDPKACTBLLRGASKEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVIPRTLIQNCGASTLRLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYT^T\A^TAVlXIJ?IDT>r/SGK^KGDD » 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAV 

DDl^FEEFGNAATSLTANPDATTVNIEDPGETPK 

KQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 

KSSPFWTFEYYQTFFDVDTYQVFDRIKGSLLPIPG 

KNFVRLYIRSKPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYWEFRKVSIAATHYAYAWLVP 

IjU,WGFLMWRNSKVMNIVSYSFLEIVCVYGYSL 

FIYIrTAILWIIPHKAVRWILVMIALGISGSLlJSJ^ 

FWPAVREDl^VAIATIVTIVI^^ 

YFFDAPEMDHLPTTTATPNQTVAAAKSS 


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGnFTTFlVGLVGIAGPWFV^KGPNRGVIITML 
VATAVCCYLFWLIAILAQLOTLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 

RPFHLAAVRNEAWISGRKLAQQKQEVRQEVEE 

WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

AWGINSETIMKPASISEEELLNLINKLNNDDNVD 

GLLVQLPLPEfflDERRICNAVSPDKDVDGFHVIN 

VGRMCLIXJYSMLPATPWGVWEIIKRTGIFILGK 

mrvn/AGRSKKVGMPIAMLLHTDGAHERPGGDA 

TVTISHRYTPKEQLKKHTILADIVISAAGIPNLITA 

DMKEGAAVIDVGINRVHDPVTAKPKLVGDVDF 

EGVRQKAGYITPVPGGVGPMTVAMLMKNTIIAA 
KKVLRLEEREVLKSKELGVATN 


3566 


A 


3 


1130 


SCRRGR(^QRRNVSLSSQFAHTMAAPAQQTTQP 

GGGKRKGKAQYVLAKRAKRCDAGGPRQLEPGL 

QGILITCKMNERKCVEEAYSLLNEYGDDMYGPE 

KFTDKDQQPSGSEGEDDDAEAALKKEVGDIKAS 

TEMRIJtRFQSVESGANNVVFIRTL^ 

LQDMYKTKKKKTRVILRMIJISGTCKAFLEDMK 

KYAETFLEPWFKAPNKGTFQIVYKSRNNSHVNR 

EEVIREI^GIVCTLNSENKVDLTNPQYTVVVEIIK 

AVCCLSVVKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENTEELGQTKPTSNPQVVNEGGAKPELASQATE 

GSKSNENDFS 


3567 


A 


248 


3498 


GKKDSSPWTCPFHPPLQIJTVaRNTRQLGDFHLA 

KIKVRNYWTADGDIJ)IGAKNrVKLYV>^^ 

KLDKGDREAPADHSILVIXJKNEKSEQLEEAMNA 

HSEESKGTHEMAGASGDKELGLGCSPPAETLAD 

AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRLS 

AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 

ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 

KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA 

RDKGLRHEPGWGTSRSVNTKERPQRATTKVHSD 

DSDEFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 

PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 

AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP 

PSKKGEQPGLSRGQDGYSGETDAGGDFKEPVLPY 

GQRLVIDIKSTWGDRHYVGLNGIEIFSSKGEPVQI 

ShnKADPPDD^PAYGKDPRVVT^IDGVNRTQ 

DDMHVWLA' : *FTRGRSHSITIDFTHPCHVAL7RIW 

NYNKSRIffi GVKOr>4LL ( DTQ CT~GEIAIC_\SG * 

TaAGAPEHFGDTILFTTODDii \,Bkb Yij.CJEMFDLD 

VGSLDSLQDEEAMRRPSTADGBGDERPFTQAGL 

GADERIPELELPSSSPVPQVTTPE^CV TIGICLQLN 

FTASWGDLHYLGLTGLEVVGKEGQALrlHLHQIS 

ASPRDLNELPEYSDDSRTLDKUDGTNTIMEDEH 

MWUPFSPGLDHVVTIRLDRAESIAGLRFWNYNK 

SPEDIYRGAKTVHVSLDGLCVSPPEGFLERKGPG 

NCHFDFAQEILFVDYLRAQLLPQPARRLDMRSLE 

CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 

GLELYDERGEKIPLSENNIAAFPDSVNSLEGVGG 

DVRTPDKLEDQVNDTSDGRHMWLAPILPGLVNR 

VYVnT)LPTTVSMIKLWOTAKTPHRGVKEFG^ 

VDDLLVYNGILAMVSHLVGGILPTCEPTVPYHTI 

LFTEDRDIRHQEKHTTISNQAEDQDVQMMNENQ 

IITNAKRKQSWDPALRPKTCISEKETRRRRC 


3568 


A 


50 


1724 


AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 

LKSKEEKDAELDKRIEALRRKNEALBRRYQEIEE 

DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 

LGPSRRSPGTPRPPGASKGGRTPPQQGGRAGMG 

RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKEWEERRRQKIEKMNEEME 

KIAEYERNQREGVLEPNPVRNFLDDPRRRSGPLE 

ESERDRREESRRHGRNWGGPDFERVRCGLEHER 

QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 

EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEIEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGBAWFFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRri^WAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

(^DmAIFKDI^niSVRLVRDKDTOKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSKPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3570 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDmAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3571 


A 


28 


131 


RHFFGNLCAMRAKWRKKRMR^ 
RSK 


3572 


A 


1 


1202 


QaEPHPICVRVT^PVRDRr * l , li 'P71 1\ CRfi L Z . 

GQAEGSDGA*. ; t\i'3lRAMi JiQTGIHATEELl W 

AKARAGSVRLIKWIEDEQLVLGASQEPVGR^ / 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WU^WSPDNSPVRLKMLYAAT^ 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT 

SAERELQQIRI>7EVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESWFIYSMPGYKCSIKERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGWHEDLRLIXETHLPSKKKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVTAEILRG 

VRIilFHNLVKGLTOI^ACKAQLGLGHSYSRAKV 

KFNVNRVDNMIQSISLLIXJLDKDINTFSMRVRE 

WYGYHFPELVKHNDNATYCRLAQFIGNRRELNE 

DKIJEKLEELTMDGAKAKAILDASRSSMGMDISAI 

DLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAP 

SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 

GAEKALrTlALKTRGNTPKYGLIFHSTFIGRAAAK 

NKGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 

EQVEERLSFYETGEIPRKNLDVMKEAMVQAEAE 

EAAAEITRKLEKQEKKRLKKEKKRLAALALASS 

ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 

EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 

KRKKSTPKEETVNDPEEAGHRSRSKKKRKFSKEE 

PVSSGPEEAVGKSSSKKKKKFHKASQED 


3574 


A 


284 


2032 


CGNERTARLWVQPVVSTMPQASEHRLGRTREPP 

VMQPRVGSKLPFAPRARSKERRNPASGPNPMLR 

PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 

DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 

STSLRRLGGFPGPPTLFSIRTEPPASHGSFHMISAR 

SSEPFYSDDKMAHHTLIXGSGHVGLRMXjNTCF 

LNAVLQCLSSTRPLRDFCLRRDFRQEVPGGGRA 

QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 

KYVPSFSGYS^DAQEFUOXMERLfflJBINRRGR 

RAPPI1ANGPVPSFPRRGGALLEEPELSDDDRANL 

MWKRYLEREDSKIVDLFVGQLKSCLKCQACGY 

RSTTTOWCDLSLPIPKKGFAGGKVSLRIX^FNLFT 

KEEELESENAPVCDRCRQKTRSTKKLTVQRFPRI 

LVLHLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 

ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 

CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 

MQEPPRCL 


3575 


A 


1 


2408 


RELDSLADLPEWKPPYANGLSTSHLRSSSVEDVK 

LUSEGRPTffiVRRCSMPSVICEHTKQFQTISEESN 

QGSLLTVPGDTSPSPKPEVFSNVPERDLSNVSNIH 

SSFATSPTGASNSKYVSADRNLIKNTAPVNTVMD 

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSN1PDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPCSFPSOST SDAES 

ISKT^JSLSYVANQEF'jILOOK>!AVQnS: . __L>TD I 

Nj^TKDTENTFVLG^VQICi r ^ VPVYSDSTIQEA 

SPNFEKAYTU>\a,PSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL 

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSILSSLENHSQQSTQPEM 

HKYGQLVKVET .FKNAEDDK1ENQIPQRMTRNK 

A>TIMANQSKQIIASCTLLSEKDSESSSPRGRIRLT 

EDDDPQIHHPRKRKVSRVPQPVQVSPSLLQAKEK 

TQQSLAATVDSLKLDEIQPYSSERANPYFEYLHIR 

KKEEEKRKLLCSVIPQAPQYYDEYVTFNGSYLLD 

GNPI^KICIPTITPPPSI^DPIJCElJ^l(^EVVR^^ 

RLQHSIEREKLIVSNEQEVLRVHYRAART1ANQT 

LPFSACTVLLDAEVYNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 


3576 


A 


5 


1421 


LRLAWHDGARWPLGTPRAAATRREAAALPPVT 

IAIXCLIX3VITLSSAENDFVHRIQEELDRFLLQKQ 

LSKVLLFPPLSSRLRYLIHRTAENFDLLSSFSVGE 

GWKRRTVICHQDIRVPSSDGLSGPCRAPASCPSR 

YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ 

PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 

PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 

CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 

GSTLQLDLEKGKESLLEKRLVAEEEEDEEEVEED 
GPSSCSEDDYSELLQEITDNLTKKEIQIEKIH1JOTS 
SFMEELPGEKDLAHVVEIYDI^PAIJCTEDIXA'IP 
SEFQEKGFRIQWVDDTHAUjIFPCRASAAEALTR 
EFSVLKIRPLTQGTKQSKLKALQRPKLLRLVKER 
PQTNATVARRLVARALGLQHKKKERPAVRGPLP 
P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RNLYIWVMi^NYSNLVFLGIWSKPDLIAHIJEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

DCDSFQKVILRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNShmHNIRHTEKKPFKCffiCGKAFNQFSTIJTH 

KKIHTGEKPYIC^CGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKIHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL 

TTHKRIHTGEKPYKCNKCGKAFIA^ 

MGKKHYKCEECGKAFIWSSVLTRHKR\aCT 

YKCEECGKAFKYSSTLSSHKRSHTGEICPYKCEEC 

GKAFVASSTLSKHBIIHTGKKPYKCEECGKAFNQ 

SSSLTKHKKIHTGEKPYKCEECGKAFNQSSSLTK 

HKKIrTTGEKPYKCEECGKAFNQSSTLIKHKKI^ 

REKPYKCEECGKAFHLSTHLTIHKILHTGEKPYR 

CRECGKAFNHSATLSSHKKIHSGEKPYECDKCG 

KAFISPSSLSRHEHHTGEKP 


3578 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 
LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 
IJLARTKNN1QRYFGTNSVICSKJCDKQSVRTEETS 
KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

B 7TKP? ~ 7 ^".SLEATLCKJL?ZL\T!]YAPKKR.E? 
LSPELV^t^SiAv.^DSLPFDKQTTBCSELLSQLQQil 
EEESRAQJ? DAKRPKISFSNHSDMKVARSATARV 

GKRLNIFDMMA\' rKEAPETDTSPSLWDVEFAKQ 
I^TVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 
GFDDDGSEFHEHM^KHLESFPKQGPIRHFMELV 
TCGLSKOTYLSVKQKVEHIEWFRNYFNEKKDILK 
ESNIQFKLRPWKFLFRNN 


3579 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LIJVRTKNNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKEhTIKKDUXJIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRDEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNESDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNITOMMAVTKEAPETDTSPSLWDVEFAKQ 

IJVTVNEQPI^NGFEEUQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGI^KNPYI^VKQKVEHmWFRNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRfflXGRMSHI^MKLLRKKIEKRMJC 
LRQRNLKFQGASNLTLSETXJNGDVSEETMGSRK 
VKKSKQKPMNVGLSETXJNGGMSQEAVGNIKVT 

KSPQKSTVLTNGEAAMQSSNSESKKKKKKKRK 

MVNDAEPDTKKAXTENKGKSEEESAETTKETEN 

NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC 

NLVNENTLKAIKEMGFTNMTEIQHKSIRPLLEGR 

DIXAAAKTGSGKT!^FIJPAVELIVKLJ(FMPRNG 

TGVIJLSPTREIjVMQTFGVI^ 

MGGSNRSAEAQKLGNGINIIVATPGRLLDHMQN 

TPGFMYKNLQCLVIDEADRILDVGFEEELKQIIKL 

LPTRRQTMIJFSATQTRKVEDIARISLKKEPLYVG 

\0DDDKANATVDGLEQGYWCPSEKRFL1XFTFL 

KKNRKKKLMVFFSSCMSVKYHYELLNY1DLPVL 

AfflGKQKQNKRTTTFFQFCNADSGTLLCTDVAA 

RGLDIPEVDWIVQYDPPDDPKEYIHRVGRTARGL 

NGRGHAlilLRPEELGELRyLKQSKVPLSEFDFS 

WSKISDIQSQLEKL1EKNYFLHKSAQEAYKSYIRA 

YDSHSLKQIF>r\^NLNLPQVA3LSFGFKVPPFVDL 

NVNSNEGKQKKRGGGGGFGYQKTKKVEKSKIF 

KMSKKSSDSRQFSH 


3581 


A 


23 


453 


LCRCICIKNITPHCLWDKVLSQFTYILDNLSNFMS 

HHPHSLRNSCLniMDIXYWQFTIYTITFX^FSHLSG 

RLTLSAQHISHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSWHLPIVYKGSMT 

QVSPH 


3582 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 

SNHLQDKJQKLYERKJKJEGMDMNYnQRKXEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAJCAQKIEMDKLEKAKKERTKIEP.'TG^ 

KCTTTVATSTTiTT E L?7/.VAI>A0XRKSK.v. 'J-SAl 

PVI TTAQFULTTTATLPAW 1 '/TI3ASGSKTTVIS 

AVGTIVKKAKQ 


3583 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 

SKHLQDIQQKLYERKDCEGMDMNYIIQRKKEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTUQPmTTTATLJPAVVTVTT^ 

AVGTIVKKAKQ 


3584 


A 


3 


1139 


PGSTISSRADRLGAPVLAHPKMAERQEEQRGSPP 

IJIAEGKADAEVK1JLYHWTHSFSSQKVRLV1AE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGENIICEATQIIDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRELIJ3SLPMDAYTHGC1LHPELTV 

DSMIPAYATTOIRSQIGNTESELKKLAEENPDLQE 

AYIAKQKRLKSKLUDHDNVKYIJCKnLDELEK^ 

DQVETE1JPRRNEETPEEGQQPWLCGESFTLADVS 

IJVVTLHRLKFLGFARRNWGNGKRPNLETYYERV 

LKRKTOJKVIXJHVNN^ 

KVLGTTLWGLIAGVGYFAFN4LFRKRLGSMILA 

LRPRPNYF 


3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELmHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARUTKVQQIRRAEPNVLLLDA 

GEKJYQGTIWFTVYKGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGLIEPLLKJEAKFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

LSWGT^VFEDEITAIXJPEVDKLKTLNVNKIIAL 

GHSGFEMDKLIAQKVRGVDVVVGGHSNTTLYT 

GNPPSKEVPAGKYPFIVTSDDGRKVPWQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPS 

KADINKWRIKLDNYSTQELGKTIVYLDGSSQSC 

RI^CNMG^ICDAMINNl^^ 

MCILNGGGIRSPIDERNNGTITWENLAAVLPFGG 

TTOLVQUCGSTLKKAFEHSVHRYGQSTGEFLQV 

GGIHVVYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PIJCMDEVYKVILPNFIAKGGIXjFQMIKDELLRH 

DSGDQDINVVSTYISKMKVIYPAVEGRIKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 


3586 


A 


1399 


881 


LSNKDVLSPQUCDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESVVQQVEQN 

LEU4TKRAVKAENHVVKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTWKQNADVALQNLRWM 

NSAQASffiQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 


3587 


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTIFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNAC1EC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSIVWVHAFPELFLS 

CL-y i D^KT'AYsrvr "rLNHERjviKf nr:xN 

lAIDVIDAYQKHFE^ ^^TDLFLKSPELVQA 

MFPKLNNQERVTLLL7 :vfIAKITSDEPLTKDDIPVF 

LRHAELIASTr^nDQCK'J\Ti:LASEEPPDDEEALA 

TIRIXDVIX^EMTVNTEIXGYLQVFPGLLERVIDL 

LRVIHVAGKFrTNIFSNCGCVRAEGDISNVANGF 

K5HLIRLIGNLCYKNKDNQDKYNELDGIPLILDN 

CMSDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASIIJCKVGFEVEKKGEKLELKSTRD 

TPKP 


3588 


A 


3 


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTTPPSYGflQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNTTDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCmPlAGQVVTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RCGFCNQPIRHKMVTALGIHWHPEHFCCVSCGE 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GRRFHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 

CFLKLFG 


3589 


A 


226 


6793 


SPPKKSRKQ^^FRIJSAERWRFFIXILMEMPRKP 

RLTLJFVQRRIENIATEREFDPEEFYYLLEAAEGHA 

KEGQGIKTDIPRYIISQLGLNKDPLEEMAHLGNY 

DSGTAETPETDBSVSSSNASLKLRRKPRESDFETI 

KLISNGAYGAVYFVRHKESRQRFAMKKINKQNL 

ILRNQIQQAFVERDILTFAENPFWSIvr^CSFETOR 

HIX^MVMEYVEGGDCATLMKNMGPLPVDMARM 

YFAFIYLALEYLHNYGIVHM)LKPDNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL 

DKQVCGTPEYIAPEVDLRQGYGKPVDWWAMGII 

LYEFLVGCVPFFGDTPEELFGQV1SDEINWPEKDE 

APPPDAQDLITLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFIPQLESEDDTSYFDTRSE 

KYHHMETEEEDDTNDEDFNVEIRQFSSCSHRFSK 

VFSSIDRnXJNSAEEKEDSVDKlKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDTESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSErlLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 

TKSLSASALSLMIPGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPrSOHSSGKNYGFT 

IRAIRVYVGDSDIYTVHHIVWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVIELLLKSGNKVSI 

TTTPFENTSIKTGPARRNSYKSRMVRRSKKSKKK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTNSSQ 

SSSPSSSAPNSPAGSGFflRPSTLHGLAPKLGGQRY 

RSGRRKSAGNIPLSPLARTPSPTPQPTSPQRSPSPL 

LGHSLGNSKIAQAFPSKMHSPPTTVRHIVRPKSAE 

PPRSPLIJCRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQFFVQREQS; ~"AP T QS VOEN' ?r T\' ~TLSRA 

RPVl:; "CLKRPVSRKVGRQESVDI ' :DRi^.KAK 

VWKKADGFPEKQESHQKFHGPGSDI "NFALFK 

IJEEREKKVYPKAVERSSTFENKASMQ£r^?LGSL 

UCDAIJHKQASVRASEGAMSIXjPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEG-EEKSS 

QAKELLRCEKIJDSKLANIDYLRKKMSLEDKEDN 

IX:PVLKPKMTAGSHECXPGNPVRPTGGQQEPPPA 

SESRAFVSSIHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVU>SSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRESSERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKNLLSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGIPCEKELGKVRR 

GVEPKPEALLARRSLQPPGBESEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQNKASDGIGQGEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ 

KLSAVGEKQTI^PKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLBAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPSNRDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFVVRQRRGKESLRSSPHK 

KAL 


3590 


A 


3 


935 


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINI^ESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHISPAEELYFGSTESGEK 

KTLR^TNVTKNIV AFK WTTAPEKYRVKPSNS S 

CDPGASVDIWSPHGGLTVSAQDRFL1MAAEME 

QSSGTGPAELTQFWKEVPR^VMEHRLRCHTVE 

SSKPNTLTI.KDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQIXI^LTMLLLAFVTSFFY 

LLYS 


3591 


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 

PLC^LHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK 

QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSR 

VEKARIALDKIIVQEMGESSKMRSRLTKLDAQVK 

EQMNRIIETRSIXjLTFHYKAJDQVRAEGQQLVNQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIC^WKKOEKDFQQFGKDVCSRVVTLE 

DSSSivL-VCMJC 


3593 


A 


3 


ib'37 


I^EKVDIQTDNDLTKEM'/EGK VSFELQRDFS 

QETDFSEASIXEK^EVHSAGNIKKEKSNTIDGT 

VKDETSPVEECrTSQSSNSY<^HTITGEQPSGCTG 

LGKSISrT)TKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKUWH 

QRLHSGEKPFKCVECGKSFSYSSHYTTHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRIHTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS 

QIJ^QHQSIHTGEKPYQCKECGKGFN>n^TKUQH 

QRIHTASLAEQLFKASGNHPNWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV 

WSSPLSILKLPRTLIRISIfflQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSVVL 

DD 


3594 


A 


39 


261 


RAAMMDTSRVQPIKLAIVIKVLGRTGSQGQCTQ 

VRVEFMDDTSRSIIRSVKGPVREGDVLTLLESERE 

ARRLR 


3595 


A 


973 


68 


GRVGTKHQMADDAGAAGGPGGPGGPGMGHRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE 

DKEWMPVTKLGRLVKDMKIKSLEEIYLFSLPIKE 

SEIIDFFIXjASLKDEVLKIMPVQKQTRAGQRTRF 

KAFVAIGDYNGHVGLGVKCSKEVATAIRGAIILA 

KI^IWVRRGYWGNKIGKPHTVPCKVTGRCGSV 

LVRLIPAPRGTGIVSAPWKKLLMMAGIDDCYTS 

ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE 

TVFTKSPYQEFTOHLVKTHTRVSVQRTQAPAVA 

TT 


3596 


A 


106 


2960 


DERRVGAADMFGRSRSWVGGGHGKTSRNIHSL 

DHLKYLYHVLTKNTTVTEQNRNLLVETIR^ 

IWGIXJNDSSVFDFFLEKNMFVFFLNILRQKSGRY 

VCVQLLQTLNIIJ^ISHETSLYYlXSN>rV r VNSn 

VHKFDFSDEEIMAYYISFIJCTL^^^ 

YNEHT^FALYTEAlKFFhmPESMVRIAVRTI^ 

NVYKVSLDNQAMLHYIRDKTAVPYFSNLVWFIG 

SHVIELDIXJVQTDEEHRNRGKLSDLVAEHLDHL 

HYL>nDIUINCEFLNDVLTDHLLNRLFLPLYVYSL 

ENQDKGGERPKISLPVSLYLLSQVFLIIHHAPLVN 

SIAEVD^GDI^EMYAKTEQDIQRSSAKPSIRCFI 

KPTETLERSLEMhflaiKGKR^ 

EDEEKGPTEDAQEDAEKAKGTEGGSKGKTSGES 

EEIEMVIMERSKLSELAASTSVQEQNTTDEEKSA 

AATCSESTQWSRPFLDMVYHALDSPDDDYHALF 

VLCLLYAMSHNKGMDPEKLERIQLPVPNAAEKT 

TYNHPLAERLIRIMNNAAQPDGKIRLATLELSCL 

LIXQQVLMSAGCIMKDVHLACLEGAREESVHLV 

RHFYKGEDIFLDMFEDEYRSMTMKPMNVEYLM 

MDASILLPPTGTPLTGIDFVKRLPCGDVEKTRRAI 

RVFFMLRSLSLQLRGEPETQLPLTREEDLIKTDDV 

LDLNNSDUACTVITK1XK3MVQRSLAVDIYQMS 

LVEriwsRLG * _ : ; v^^a - T xor: 'CVT^VEDDS 
■Ri*j hhkpassphskpfpilqa* > ffs^„:acm\K 

QRLAKGRIQARRNIKMQRIAALLDJJ'IQPTIEVLG 
FGLGSSTSTQHLPFRFYDQGRRGSSOrr^QRSVF 
ASVDKVPGFAVAQCINEHSSPSLSSQSPP3AJ3GSP 
SGSGSTSHCDSGGTSSSSTPSTAQSPAGIGHVTQ 


3597 


A 


427 


277 


GVRRIQHHWAQMHECNVrnYASI^COlXHTG 
KLCCLNSHRHFHCIKYSK 


3598 


A 


1 


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNFQL 

MREODQRTEDKKAEIDILAAEYISTVKTLSPDQR 

VERLQKIQNAYSKCKEYSDDKVQLAMQTYEMV 

DKHIRRLDADLARFEADLKDKMEGSDFESSGGR 

GLKKGRGQKEKRGSRGRGRRTSEEDTPKKKKH 

KGG 


3599 


A 


2 


3907 


KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYWSVGYQH 

DMIVNVWAWKKNIVVASNKVSSRVTAVSFSED 

CSYFVTAGNRHIKFWYLDDSKTSKVNATVPLLG 

RSGLLGELRKNLFTDVACGRGKKADSTFCITSSG 

LLCEFSDRRLLDKWVELRVYPEVKDSNQACLPP 

SSFrrcSSDOTIRLWNTESSGVHGSTLHRNILSSDL 

IKIIYVDGNTQALLDTELPGGDKADASLLDPRVGI 

RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 

KVEAHDSEILCLEYSKPDTGIJCLLASASRDRLIH 

VIJ)AGREYSlX3QTLDEHSSSrrAVKFAASDGQVR 

MISCGADKSIYFRTAQKSGIXjVQFTRTHHVVRK 

TTLYDMDVEPSWKYTAIGCQDKNIRIFNISSGKQ 

KKLHCGSQGEIX5TUKVQTDPSGIYIATSCSDKNL 

SIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLIS 

VSGDSCIFVWRLSSEM11SMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEETPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVEI^VRSMLDLRQLETLAPSLQDPSQD 

SLAIIPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHIIRLLSQEEGVFAQDLEPAPIEDGIVYPEP 

SDNPTMDTSEFQVQAPARGTI^RVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRS1SVGEN 

LGLVAEPQAHAPIRVSPl^KLALPSRAHLVLDIPK 

PLPDRFELAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL 


3600 


A 


1688 


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 
QFSFQQGGWGASLADRLVRKCDYLNRGFSGYN 

VlIJTPTTIXJETAWEEQCnQGCKLNRLNSVVGEY 
AJ^V^OVAQDCGTDVLDLWTLMQDSQDFSSYL 
SIXjLHLSPKGNEFIJFSHLWPLIEKKVSSLPLLLPY 
WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARNKSEKRYYSEFL 
QIAHLFNYGLSSFLREFIIFLIKLLQ 


3602 


A 


37 


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRITSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHELWCNLHTPRRPACNAPWHSPVGEI 

SPPPRESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLKPSCSTDSSF 

TRTPVPTVSLASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEQPPFPEGYKVKQEPVTTVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

PKEYl^TrWVLLPGMASLLHQAKKEKCFEVVL 

QMTPSGGKACVWGHLPSSSHTI 


3603 


A 


286 


587 


NISNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR 

RMQKKYWKTKQVFIKATGKKEDEHLVASDAEL 

DAKIJBVFHSVQETCTCIXKIffiKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT 

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS 

GEAIGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

IJEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKNIEKIIHVTTKLVPSIKRIJIN 

SHNHNRNSATKNLGKIFGNGNNFPHSPSSTKNEN 

AXTGANSCEHDHYEKHl^HKQAPTHHQKIHPEE 

KLWCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

VFTLKSNLITHQKIHTGQKPYKCSECGKAFFQRS 

DLFRHLRIHTGEKPYECSECGKGFSQNSDLSIHQ 

KTrTrGEKHYECNECGKAFrRKSALRMHQRIHTG 

EKPYVCADCGKAHQKSHFNTHQRIHTGEKPYEC 

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV 

FTHRTh^TTHQKTHTGEKPYMCAECGKAFTDQS 

NLIKHQKTOTGEKPYKCNGCGKArWKSRIJaH 

QKSfflGERHYECKDCGKAFIQKSTLSVHQRIHTG 

EKPYVCPECGKAHQKSHFIAHHRIHTGEKPYECS 

DCGKCFITCKSQIJIVHQKJHTGEKPNICAECGKAF 

TDRSNLITHQKfflTREKPYECGIX:GKTFTWKSRL 

NIHQKSHTGERHYECSKCGKAFIQKATLSMHQn 

HTGKKPYACTECQKAFTDRSNLJKHQKMHSGEK 

RYKASD 


3605 


A 


3 


322 


SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQG1 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN 
VIRDAVTYTEHAKRKTVTAMDVVYALKRQGRT 
LYGFGG 


3606 


A 


1 


1749 


VPVTAEAKLMGFTQGCVTFEDVAIYFSQEEWGL 

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTDILHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG 

¥JWL>\ PLGTT-Cr^: V ANYEi^KKJS^^E AFH\ 3: 

SHYKWSQC ^ibSL^lQtn'FFHPRVCTGKRLYESS 

KCGKACCCECi LVQLQRVHPGERFYECSECGKS 

FSQTSHLKDHKRJHTGERPYVCGQCGKSFSQRAT 

IJKHroVHTGERPYEOjECGKSFSQSSNLIEHCRI 

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK 

PYECGECGKLFRQSFSLVVHQRIHTTARPYECGQ 

CGKSFSLKCGLIQHQLIHSGARPFECDECGKSFSQ 

RTTLNKHHKVHTAERPYVCGECGKAFMFKSKL 

VRHQRTHTGERPFECSECGKFFRQSYTLVEHQKI 

HTGLRPYDCGQCGKSFIQKSSLIQHQWHTGERP 

YECGKCGKSFTQHSGLILHRKSHTVERPRDSSKC 

GKPYSPRSNIV 


3607 


A 


92 


331 


AMAGPGPGPGDPDEQYDFLFKLVLVGDASVGKT 
CWQRFKTGAFSERQGST1GVDFTMKTLEIQGKR 
VKLQIWDTAGQER 


3608 


A 


545 


379 


AIKGYIHLSAPRNRYMHTTASNGRMLFMKVTM 
YMRRGVQIMGWSVRMAFMACFTQ 


3609 


A 


118 


873 


VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC 
GHSYCKGCLVSLSYHLDTKVRCPMCWQVVDGS 
SSLPNVSIAW\OEALRLPGDPEPKVCVHHRNPLS 
IJCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 
MKEELAALFSELKQEQKKVDELIAKLVKNRTRIV 
NESDVFSWVIRREFQELRHPVDEEKARCLEGIGG 
HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF 

GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987 


DPRVRPPLLQPPPPIJLPRL\aUCMAPLDU)KYV^ 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDfflGQFYDUIELniTGGQVPDTOYIFM 

GDFVDRGYYSLETrTYLLAIJCAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VFDMLTVAALIDEQ1LCVHGGLSPDIKTLDQIRTI 

ERNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHINNIJKLICRAHQLVHEGYK 

rWDEKLVTVWSAPNYCYRCGNIASIMVFK^ 

TREPKLFRAVPDSERVIPPRTTTPYFL 


3611 


A 


2459 


869 


AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAWLKATQEAPAASTLGSYSLPG 

TlJUCSEIUrraGTMOTLGAETKN^ 

EEAEKPLESERIQKADPQGPELGEACEKGNMLK 

RQRIKREKKDFRQVIVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLVVHQRIHTGEKPFECHECGKAFIQSAN 

LVVHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ 

KIHS1.EKTFKCNECEKAFSYSSQLARHQKVHITE 

KCYECNECGKTFTRSSNLIVHQRIHTGEKPFACN 

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL 

IVHQRIHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE 


3612 


A 


318 


2245 

* 



SPMAEAALVNTPQIPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAiPSEQTLSAQGVSQARTPKLGP 

~IPNAH3C£fv ICV YMKDILYLSEIIQGTLPV/QKPY 

i JVASGKWFSFGSNLQQHQNQDSGBKE I ;tf03ESS 

ALU.NSCKIPI^DMJ^CKDVEKDFPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV

rEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPWChnCGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFTHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ 

SU^HHRIHTGERPYECKECGKAF 

RIHTGEKPYVCnCGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAnSKQTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RIHTGEKPYECGKCGKAFNKRYSLVRHQKVHrr 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

KREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATELPAAYATPQPGTPVSP 

VQYAHLPHTFQFIGSSQYSGTYASFIPSQLPPTAN 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 

QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 










PAQQNQYVmSSSPQNTGRTASPPAIPVHLHPHQ 

TMIPHTLTLGPPSQVVMQYADSGSHFVPREAIK 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGKSVPHPYESRHVWHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAWHTFVTTALPKSENFNPEALXTQAA 

YPAMVQAQIHLPVVQSVASPAAAPPTLPPYFMK

GSIIQLANGELKKVEDLKTEDFIQSAEISNDLKIDS 

STVERIEDSHSPGVAVIQFAVGEHRAQVSVEVLV 

EYPFFVFGQGWSSCCPERTSQLFDLPCSKLSVGD 

VCISLTLKNLKNGSVKKGQPVDPASVLLKHSKA 

DGLAGSRHRYAEQENGINQGSAQMLSENGELKF 

PEKMGI^AAPFLTKIEPSKPAATRKRRWSAPESR 

KLEKSEDEPPLTLPKPSLIPQEVKICIEGRSKVGK 


3614 


A 


3 


114 


FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 
LEG 


3615 


A 


3 


1603 


DAWALTNQFSDSKQHIEVLKESLTAKEQRAAILQ 

TEVDALRLRLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKIDMLDVKERKVNVIXJKKIENLQEQL 

RDK^KQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERTIERLKEQRDRDEREKQEEEDNYKKDL 

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS 

GLKKDSRLKTLEIALEQKJKJB^CLKMESQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 

AEVDRIXEILKEVENEKNDKDKKIAELESLTSRQ 

VKDQNKKVANLKHKEQVEKKKSAQMLEEARRR 

EDNLNDSSQQLQDSLRKKDDRIFELEEALRESVQ 

ITAEREMVT^.QEESARTNAEKQv !i.JJ.1Aj.ZX^* 

KQELESMKAKLSSi k ^QSLAEKFiTlLTOLRAERR 
KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 
KXTQEEVAALKREKDRLVQQLKQQIXJNRMKLM 
ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA 


3616 


A 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKPDL 

PTWKRNFRSAIJJRKEGLR^ 

YEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDIL 

DELLGNMVLAPLPDPGPPSLAVAPEPCPQPLRSPS 

LDM>TPFPNLGPSENPlJm.LVPGEEWEIWrAF 

YRGRQVFQQUSCPEGLRLVGSEVGDRTLPGWP 

VTLPDPGMSLTDRGVMSYVRHVLSCLGGGLAL 

WRAGQWLWAQRl^HCHTYWAVSEELLPNSGH 

GPDGEVPKDKEGGVFDLGPFIVGSLGPPDLITFTE 

GSGRSPRYALWFCVGESWPQDQPWTKRLVMVK 

VVPT(XRALVEMARVGGASSLENTWLHISNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 


3617 


A 


852 


304 


RGGLLSKMARVLKAAAANAVGLFSRLQAPIPTV

RASSTSQPLDQVTGSVWNLGRLNHVAIAVPDLE 

KAAAFYKNILGAQVSEAVPLPEHGVSVVFVNLG 

NTKMELLHPLGRDSPIAGrXQKNKAGGMHHICIE 

VDNINAAVMDLKKKKmSLSEEVKIGAHGKPVff 

LHPKDCGGVLVELEQA 


3618 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAVVRCTLSANMYVDEDLVWCASEL 



NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 
GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 
SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 
VLLXGFNTTOF1KVLRQHRMMILYCTLLASAQSE 
AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 
ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 
V1J)I^LVFTQGSHFMANKRCQLPIXjSFRRQRK 
GYEBVHVPALKPKPFGSEEQLLPVEKLPKYAQA 
GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 
GKT>TVAIJ4CMLREIGKiiINMIX3TINW 
APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 
LCKEEISATQIIVCTPEKWDIITRKGGERTYTQLV 
RLHLDEIHLLHDDRGPVLEALVARAIRNIEMTQE 
DVRIJGLSATLPNYEDVATFLRVDPAKGLFYFDN 
SFRPWLEQTYVGITEKKA1KRFQIMNEIVYBKIM 
EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 
TLGLFLREGSASTEVIJITEAEQCKNLELKDLLPY 
GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 
TIAWGVNLPAHTVIIKGTQVYSPEKGRWIELGA 
LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 
SLLNQQLPBBSQMVSKLPDMLNAEIVLGNVQNA 
KDAVNWLGYAYLY1RMLRSPTLYGISHDDLKGD 
PLLDQRIO,DLVHTAAU^KNNLVKYDKKTGN 
FQVTELGR1ASHYYITNDTVQTYNQLLKPTLSEIE 
LFRWSLSSEFKNITVREEEKLELQKLLERVPIPVK 
ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 
VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 
MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 
FERLYDLNHNE1GELIRMPKMGKTIHKYVHLFPK 
LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 
! BATW* .VEEVT : ■ T rttfflETTLL" * ^ r AQDEHLI 
TFFVPVFEFi 1 i i ; ^ VVSDRWLSCETQLPVSFR 
HLELPEKYPPPr^LLDLQPLPVSALRNSAFESLYQ 
DKFPFFNPIQTQYi^TVYNSDDNVFVGAPTGSGK 
TICAEFADJlMIXQNSEORCWrn^MRLWQEQVY 
MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 
>^^STPEKWDE^RRWKQRKNVQNI^^JFVYDEV 
HUGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 
LSNAKDVAHWIXK^SATSTFNFHFNVRPVPLELHI 
(^FhnSrntJTRlXSMAKPVFHMTKHSPKKPVIW 
VPSRKQTRLTATOILTTCAADIQRQRFLHCTEKDL 
IPYIJ^SDSTIJaETUJ^GVGYIilEGLSPMER^ 
VEQLFSSGAIQVWASRSLCWGMNVAAHLVnM 
DTLYYNGKMAYVDYPIYDVLQMVGHANRPLQ 
DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 
HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 
RMTQNPNYYl^QGISHRHLSDHLSELVEQTLSDL 
EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL 
FSMSLNAKTKVRGLIEIISNAAEYENIPIRHHEDN 
LLRQLAQKWHKI2WKFNDPHVKTNLLLQAHL 
SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 
WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 
PSGIJFKRCTDKGVESVTOIMEMEDEERNALLQLT 
DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 
VVVLVQLEREEEVTGPVIAPLFPQKREEGWWW 










IGDAKSNSUSIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3619 


A 


3 


5992 



DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 

DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLL1XjF>HTOFIKYLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VIJDIJEDLVFn^SHFMA>IKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA

GKTm^ALMCmmGKHINMDGTINVDDIlC^ 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQIIVCTPEKWDnTRKGGERTYTQLV 

RLIILDEIHIXHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPWI^QTYVGITEKKAIKRFQIMNEIVYEKM 

EHAGKNQVLVFVHSRKETGKTARA1RDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAAOvlLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE 

I^RVFSI^^EFKNTTVREEEKLELQKLLERVPIPVK 

ESEEEPSA1 *VL7. QAHSOI T^t" "/ "'-MADXviVY 

\ V>SAGRL2v®AIFEIVL^ 

MIDKRMWQSMCPLRQFRKLFE. VVKKIEKKNFP 

FERLYDUsIHNEIGELIRMPKMGKTUX 

LEl^VHLQPITRSTLKVELTITPDFQWi. v 3KVHGSS 

EAFWILVEDVDSEVILHHEYFIJLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRWSDRWLSCETQLPVSFR 

HLILPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFFrTNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAKJRMLLQNSEGR(7VYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

MHSTPEKWDILSRRWKQRKW^^ 

fflJGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 

LSNAKDVAHWlXK^SATSTFNFrn'NVRPVPLELHI 

QGFMSHTQI1UJLSMAKPVFHMTKHSPKKPVIVF 

VPSRKQTRLTAIDILTTX3AADIQRQRFLHCTEKDL 

PYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQIjreSGAIQVWASRSLCWGMNVAAHLVIIM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL 

FSMSLNAKTKVRGUEnSNAAEYENIPIRHHEDN 

LLRQLAQKWHKLNNPKJ^PHVKTNIXI^AHL 










SRMQLSAELQSDTEEII^KAIRIJQACVDVLSSNG 

WI^PAIAAMEIAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTOKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEVVDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWW 

IGDAKSNSUSIKRLTLQQKAKVKLDFVAPATGG 

RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 

DSD 


3620 


A 


1205 


323 


VIKMAIAARLLPQFLHSRSLPCGAVRLRTPAVAE 

VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 

LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

EKPQQHQKTKMIVLGFSKPINWVRTRIKAJF^ 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVlJiAIJCEKVTSLPDNrnCNALAANIDEI 

VFreTGDISIYYDEKGRKFVNILMCFWYLTSANIP 

SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 

EFTQGVKPDWTIARIEHSKLLE 


3621 


A 



2 



2995 


SSSRSRHSSISPVRU>LNSSIX3AELSRjKJCKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 

THLNTEVKNSSDTGKVKLDENSEKHLVKDLKAQ 

GTRJDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

^VKTOVSVTAAIPHLKTSVi >Pl ,pr 7. LPCDr r ^! 

DSPKETLPSievKXEKEQR 

GDLSPPDSPEPKAIIPrXJQPYKKRPKICCPRYGCR 

RQTESDWGKRCVDKFDnGnGEGTYGQVYKAKIi 

KDTGELVAIjQCVRLDNEKEGFPITAIREIKILRQL 

IHRSVVNMKEIVTDKQDAIX)FKKDKGAFYLVFE 

YMDHDLMGIXESGLVHFSEDHIKSFMKQLMEGL 

EYCHKKNFLHRDKCSNIIXNN^ 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AroVWSCGCILGEUnrKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVIKlJ > YinmiKPKKQYiaa(LR 

EEFSFIPSAALDLUDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVWEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

IJsJLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 

LPSAEQTTLBASSTPADMQNILAVLLSQLMKTQE 

PAGSLEENNSDKNSGPQGPRRTPTMPQEEAAGRS 

NGGNAL 


3622 


A 


16 


390 


TPERGSAYPETAAVRRPAGECPITMSDLEAKLST 
EHLGDKIKDEDIKLRWGQDSSEfflFKVKMTTPLK 
KIJOCSYCQRQGWVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQIGGHSTV 


3623 


A 


2 


1544 


PPPAPGPDGLNEGCLHRLSMPHQRPRTCAMNPE 










LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASBLDGGDYRPE 

I^IPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSFTLMRDERGLPAMNNLYSPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

Ml^PNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

QVATSGQLEEINTKEVAQRITAELKRYSIPQAIFA 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWKWLQEPEFQRMSAUQAACKRKEQEPNKDR 

NNSQKKSRLVFTDLQRRTLFAIFKENKRPSKEMQ 

ITOQQLGLELTTVSNFFMNARRRSLEKWQDDLS 

TGGSSSTSSTCTKA 


3624 


A 


27 


2152 


SARKAEAATSGTAARDGSVGRNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKTT 

AKGDINQKLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEKIECNKRHKTVLTELQAKIARLTKRF 

EAAKEDLKKRHEHPPNPPVSPGKTVNDVNSNNN 

MSYRNAGTVRQMLESKRNVSESAPPSFQTPVNT 

VSSTNLVTPPAWSSQPKLQTPVTSGSLTATSVLP 

APNTATWATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTOQPSGNVEFISVQSPPT 

VSGLTKNPVSLPSLPhfPTKIWrWSWSPSI^ 

TASAAPLGTTLAVQAVPTAHSIVQATRTSLPTVG 

PSGLYSPSTNRGPIQMKIPISAFSTSSAAEQNSNTT 

PRIENQTNKT1DASVSKKAADSTSQCGKATGSDS 

SGVTDLTMDDEESGASQDPKKLNHTPVSTMSSSQ 

FVSRPLQP:Q?A^LQPSGVi^G^ , :7rTI!!! 7JPTA. 

I .TVNVTHRPVTQ - ; V ,TT 

U>APPAQAP1JIGTVMQAPAVRQVNPQNSVTVRV 

PQTTIYVVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PilRVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV 

SHAFRVKMAIVLVMECPGGGSKLCHC 


3625 


A 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LC^SRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAJOrTroiMDDWLIX^AFTCG 

VHCHGQGKYPCLQVFVNLSHPGQKAI1JHYNEE 

AVQINPKCFYTPKCHQDRNDLLNSALDIKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALIVGMVRLTQHLSLLCEKYSTVV 

RDEVGGKVPYIEQHQFKLCIMRRSKGRAEKS 


3626 


A 


9 


921 


SSWEFSALSVSMACLSPSQLQKFQQDGFLVLEG 

FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 

EEEQ1JRAQGSTDYFLSSGDKIRFFFEKGVFDEKG 

NFLVPPEKSINKIGHALHAHDPVFKSITHSFKVQT 

LARSLGLQMPVWQSMYIFKQPHFGGEVSPHQD 

ASFLYTEPlAlRVUJVWIAVEDATLENGCLWFIPG 

SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 

FVPTPVQRGALVLIHGEVVHKSKQNLSDRSRQA 

YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


3627 


A 


231 


644 


INSSPRTGRDHQELNLHTERDSRSQRAVLKIPRQ 










NPGIFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERIGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 


3628 


A 


2 


810 


GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWI^GKSKAKPNGKKPAAEERKAYLEPEHTKA 

RITDFQFKELVVLPREroLNEWIJVSNTTTTFHHm 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KYKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LEUIGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 


3629 


A 


699 


1604 


CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFRBKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RLSATLEENDLLQGTVEELQDRVLILERQGHDKD 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSQQLEAWQDDMHRVIDRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 


3630 


A 


423 


1 


PAKVLTLDIYLSKTEGAQVDEPWITPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 


3631 


A 


2082 


674 


WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 
PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 
PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVTIF 

L/-i VGv?DAHCQLLRFQAHQQQC : • KAkL.GSKEQ 

GPRQRKGAAPABKKCGAETQHEC:.ELRVENLQA 

VQTOFSSDPLQKVVOFNHDhTILlATCeTDGY^ 

\HVKVPSLEKVIJBFKAHEGEIEDIA1X}PDG^XVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEVVSCLDVSES 

GTFLGIX5TVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

IJILLPSRRSVPVWLLLLLCVGLIIVTILLLQSAFPG 

FL 


3632 


A 


942 


40 


PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTTDTMTHCCSPCCQPTCCRT 

TCmTTCWKPTIVTTCSSTPCCQPSCCVPSCCQP 

COnnrcCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQTICFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 

CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 

CYRTTCCRPSCCCSPCCVSSCCQPSCC 


3633 


A 


605 


3004 


GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTKLIYCSRTVPEIEKVIEELRKLIJ4FYE 










KQEGEKLPHXjL^SSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTASYVRAQYQHDTSLPHC3U 7 YEEFD 

AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANVVWSYHYLIJDPKIADLVSKELARK 

AVVVFDEAHNIDNVCIDSMSVNLTRRTLDRCQG 

NLETLQKTVLRIKETDEQIU^RDEYRRLVEGLREA 

SAARETDAHLANPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHVVQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSLLHTLEITDLADFSPL 

TIXAOTATLVSTYAKGFITIIEPFDDRTPTIANPIL 

HFSCMDASLAIKPVFERFQSVnTSGTLSPLDIYPK 

ILDFHPVTMATFTNfrLARVCLCPMIIGRGNDQVA 

ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 

rTSYQYMESTVASWYEQGIUMQRNKIXFIETQ 

DGAETSVALEKYQEACENGRGAILLSVARGKVS 

EGIDFVHHYGRAVIMFGVPYVYTQSRILKARLEY 

LR1XJFQIRENDFLTFDAMRHAAQCVGRAIRGKT 

DYGLMVFADKRFARGDKRGKLPRWIQEHLTDA 

NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 

SLLSLEQLESEETLKRDBQIAQQL 


3634 


A 


159 


384 


1JCMSSKTASTNNIAQARRTVQQLRLEASIERIKV 
SKASADLMSYCEEHARSDPLLIGIPTSENPFKDKK 
TCIIL


3635 


A 


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 
LETQFTCPFCNHEKSCDVKMDRARNTGVISCTV 
CLEEFQTPITCILGNLGFFQRVGRGLESGPCSSGP 
LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 


3636 


A 


48 


282 


DHUCSCYQDSHEDPTKMKRFLFLLLTISLLVMVQ 
IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANAI 
IHLFCFS 


3637


A 


i 



1248 


ARAGSVVG3AAAP.GPPAGC r>RA. 4 RI "^SSPft?.. 

RRRCDWVEDGA* ^.^vJEILNTr/SKFASICrMGA >; i 

ASAI^KEIGPEQFPVNEHYFGLVNFGNTCYCNSV' { 

LQALYFCRPFREKGLAYKSQPRKKESLLTCLADL 

FHSIATQKKKVGVIPPKKFrnaRKENEIJ^NYM 

QQDAHEFLNYU.OT1ADILQEERKQEKQNGRLPN 

G>TONENNNSTPDPTWVHEIFQGTLTOEmCLTC 

ETISSKDEDFLDLSVDVEQNTSITHCLRGFSNTET 

LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 

HLKRFKYMDQLHRYTKl^YRVVFPLELRLFNTS 

GDATNPDRMYDLVAVVVHCGSGFNRGHYIAIV 

KSHDFWLLFDDDIVEKIDAQAIEEFYGLTSDISKN 

SESGYILFYQSRD 


3638 


A 


11 


630 


PAGIPVSTISSDRRAS1DLTRKMKPDETPMFDPNL 

IJKEVDWSQNTATFSPAISPTHPGEGLVLRPLCTA 

DLNRGFreVLGQLTETGWSPEQFMKSFEHMKK 

SGDYYVTVVEDVTLGQIVATATLIIEHKFIHSCAK 

RGRVEDVWSDECRGKQLGNLLLSTLTLLSKKL 

NCYKITLECLPQNVGFYKKFGYTVSEENYMCRR 

FLK


3639


A 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHSSPL 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCIXSPVVLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFPGLSSTLQLLAMALE 










CWI^LGHPFFYRRMTLRIXjALVAPVVSAFSLAF 

CALPFMGFGKFVQYCPGTWCHQMVHEEGSLSV 

LGYSVLYSSLMALLVIATVLCNIXjAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEBLD 

HLLLIALMTVLFTMCSI^VIYRAYYGAFKDVKE 

KhniTSEEAEDlJlALRFI^\aSIVDPWIFIIF^ 

IFFHKIFIRPLRYRSRCSNSTNMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 

AIEAIKLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 

SIEKIVEIDAHIGCAMSGLIADAKTLIDKARVETQ 

NHWFTYNET^fIVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 

QCDARAIGSASEGAQSSLQBVYHKSMTLKEADCS 

SUIIJCQVMEEKI^AT^I^WQPGQNFHMFTK 

EELEEVDCDI 


3641 


A 


2 


1254 


rTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALKIVPT 

YERNOVFRLGRIRTPQGPGN1VLLLPFIDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRIWDP 

VLSVMTVKDLNTATRMTAQNAMTKALLKRPLR 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 

SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 

VLPSGTQSAYFLDLTTGRGRVGHGVPDGIPDW 

VEMAEADLRALLCRELRPLGAYMSGRLKVKGD 

LAMAMKLEAVLRALK 


3642 


A 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEKPRKH 
DSGAADLERVTDYAEEKEIQSSNLETAMSV1GDR 


3643 


A 


94 


541 


xv^r AIRRRRRM^^ -. Fil 
il^DLECDYmARSCCSKimWVIPELIGHTlVTV 
il L^4SLHWFIFIXNLPVATWNIYRYIMVPSGNM 
GVPDFIEIHNRGQIJCSHMBCEAMIKIXjFHLLCFF 
MYLYSMELALIND 


3644 


A 


95 


2808 


TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL

FCSRDPVPEYLNHCGVKYVUSDRASFCALHIFFS

PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEMEIPK

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ

VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG

PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN

QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP

APAPEEEAEGPAAALGPRGPLGSGPGWLYLCPE

AU:GQTFAKKHQLKMHLLTHSSSQGQRPFKCPL

GGCGWTFTTSYK1KRHLXJSHDKLRPFGCPAEGC

GKSFTTVYNliCAHMKGHEQENSrTCCEVCEESrT 

I yAlULuAHQRSHr EPKKF Y 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYMSKKHLQDVD 










TWKSRCPISSCNKLFTSKHSMKTHMVKRHKV^ 

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSDFFGQEGETQFGFPNAA 

GNHGSQKERNLITVTGSSFLV 


3645 


A 


2194 


1707 


TVSFHKTMASLKCSTVVCVICLEKPKYRCPACRV 

PYCSWCTRKHKEQCNPETRPVEKK^ 

VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 

LGESATIJ^IXLNPHI^QLMVNLDQGEDKAKLM 

RAYMQEPLFVEFADCCLGIVEPSQNEES 


3646 


A 


85 


1948 


ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELHHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKIPSESG 

EKIXVVNERATLrTOTSNAMINACRDFLEL^EIHS 

RKWQRALQYEQEQRVHLEETffiQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT

EYFDAMEDSTSFITVITEAKEDSRKAEGSTGTSSA 

DWSSADNVLDGASLWKGSSKVKRRVRIPNKPN 

YSLNLWSIMKNCIGRELSRIPMPWFNEPLSMLQ 

RLTEDLEYHHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHRIAKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEIHSSKF 

RGKYISIMPLGAIHLEFQASGNHYVWRKSTSTVH 

NIIVGKLWIDQSGDIEIVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGWSDSQGKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELAL'H. NEHEEG VAPTDS 

RLRTDQRI A.'EKGR^nDEANTI ^ 3?J r:~K ?RL?I*: * 

RRRLEACGPGSSC83iiE i 


3647 


A 


46 


5007 


PTGDACVSTSCELASALSHLDASHLTENLPKAAS

ELGQQPMTBLDSSSDLISSPGKKGAAHPDPSKTS 

VDTGQVSRPENPSQPASPRVTKCKARSPVRLPHE 

GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 

ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 

PGNHSKALEMTGIHAPESSQEPSLLEGADSVSSR 

APQASLSMLPSTDNTKEACGHVSGHCCPGGSRE 

SPVTOIDSFIKELDASAARSPSSQTGDSGSQEGSA 

QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 

GAPAYPQWASQPSVLJISIHPnKHFTVNKNFLSN 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 

DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

PSESEEEQimCSTRGCPNPPSSPAHLPTQAAICPAS 

AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 

SSQPSSLLEMSSQEHBTHADISTSQNHRPSCAEET 

TEVTSASSAMENSPLSKVARHFHSPPIILSSPNMV 

NGLEHDLLDDETLNQYETS1NAAASLSSFSVDVP 

KNGESVLENLHISESQDLDDLLQKPKMIARRPIM 

AWFKEINKHNQGTHLRSKTEKEQPLMPARSPDS 

KIQMVSSSQKKGVTVPHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTDHRKPUSPQTSHKTLSKAVS 

QRLHVADHEDPDRNTTAAPRSPQCVLESKPPLAT 










SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ

PRPTGEKGGNIMASDRLERTNQLKIVEISAEAVSE

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH

PPHSLGRSRDSQVPVTSSWPEAKASRGGLPSLA

NGQGIYSVKPLLDTSRNLPATDEGDHSVQETSCL

VTDKKVTORHYCYEQNWPHESTSFFSVKQRIKS

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL

IJPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS

LGPLGIPTPTMTI^PVKRNKSSVRHTQPSPVSRS

K1X5E1JIAI^MPDIJDK1X:SEDYSAGPSAVLFKTEL

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE

E6VCFIVLNRKEGSGLGFSVAGGTDVEPKSITVH

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH

GNVLKVLHQAQLHKDALVVIKKGMDQPRPSAR

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC

VEVLKTSAGLGLSLDGGKSSVTGDGPLVDCRVY

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA

WNIMKSVPEGPVQLLIRKHRNSS


3648 


A 


337 


1564 


KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 
VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL 
CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 
PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 
TYL - KGYU^™ ? IDKVAVEV/O^HGVLE^HL 
AEKGRGkC:- LSi>Li /TNGiCLYSVDDRTGVVYQlE 
GSKAWW\R.SIXjIXjTVEKGFKAEWLAVKDER 
LYVGGLGKE vvITTTGDWNENPEWVK WGYK 
GSVDHENWVSNYNaLRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL
LSASPDFGDIAVSHVGAVVPTHGFSSFKFIPNTDD
QEVALKSEEDSGRVASYIMAFILDGRFLLPETKI

GSVKYEGtEFI 


3649 


A 


1 


775 


PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA

QFKSIQHHLRTAQEHDKRDPWAYYCRLYAMQ

TGMKIDSKTPECRKFI^KLMDQLEALKKQLGDN

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF

HKNMIKSFYTASIXIDVITWGELTDEBRVKHRKY

ARWKATY1HNCLKNGEIPQAGPVGIBEDNDIEEN

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI

QIPPGAHAPANTPAEVPHSTGVAK


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWEUJVHFKIHGQGKKNLHGDGLAI 

WYTKDRMQPGPWGNMDKFVGLGVFVDTYPNE 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTAIVRM.HYDmvmYVKRHLTIMMDIDGK 

HEWRDCBEVPGVRLPRGYYFGTSSITGDLSDNHD 

VISLKLFELTVERTPEEEKLHRDVFIJ^VDNMKL 










PEMTAPLPPLSGLALFLIVFFSLVFSVFAIVIGIILY 
NKWQEQSRKRFY 


3651 


A 


1 


1218 


RSWAYVKKOCNNMCPNRGLrnXSPEPCWIJfflA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYI^LVIXjNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIILILISFTCRFLLNSRVTDAAFNF'LLVW 

YY<m,TIRESILI^GSRIK^ 

VMLTWPIXjLMYQKFJRNQFLSFSMYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW 

RVLTFLLPFLrTGHFWQIJFNALTLFNLAQDPQCK 

EWQVLMCGFPFLLLr^GNFrTIIJlWHHKFHS 

RHGSKKD 


3652 


A 


640 


164 


VTTSCIIPFAFGIX5VRASERLAEIDMPYLLKYQPM 

MQTIGQKYCMDPAVIAGVLSRKSPGDKILVNMG 

DRTSMVQDPGSQAPTSWISESQWQTTEVLTTRI 

TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 

QDLSCDFCNDVLARAKYLKRHGF 


3653 


A 


2 


909 


IVRRDWQEVSD1HLAMANCKMTKSIRFPALEHC 

YTGGEWLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQG1THSLLLQVJDDKGSILPP 

NTEGNIGIR1KPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIENASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEWK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSBLPKTITGKIERKELRKKETGQM 


3654 



A 


2 


909 


IVRRDWOEVSDIHI^MANCKNrrKSIRFPALEHC 
YTGGEV\ ( v.'KTOEL VKPJTT':- ..YBNYoOSn-'C- 
LICATYWGMKIKPGFMGR/ .TPl- i ju- /QFHMEAS V 
ENCnVSMNTADPGSQGITHS T LLQVIDDKGSILPP 
NTEGNIGIRIKPVRPVSLFMCYEGDPEKTAKVEC
GDFYNTGDRGKMDEEGYICFLGRSDE'IINASGYR 
IGPAEVESALVEHPAVAESAWGSPDPIRGEWK 
AFIVLTPQFLSHDKDQLTBvELQQHVKSVTAPYKY 
PRKVEFVSELPKTITGKIERKELRKKETGQM 


3655 


A 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL 

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQNISWQRPKDIINNPIJTMDGISPTO 

W1XAAIGSLTTCPKLLYRVVPRGQSFKKNYAGIF 

HFQIWQFGQWVNVVVDDRLPTKNDKLVFVHST 

ERSEFWSALLEKAYAKLSGSYEALSGGSTMEGL 

EDFTGGVAQSFQLQRPPQNLLRLLRKAVERSSL 

MGCSffiVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVRNPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN 

FTLLEICNLTPDTLSGDYKSYWHTTrTEGSWRTG 

SSAGGCRNHPGTFWTNPQFKISLPEGDDPEDDAE 

GNVWCTCLVALMQKNWRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEIF 

TNSREVSSQLRLPPGEYmPSTFEPHRDADFLLRV 

FTEKHSESWELDEVNYAEQLQEEKVSEDDMDQ 










DFLHLrTCIVAGEGKEIGVYELQRIXNRMAIKFKS 

rTCTKGFGUMCRONflh^ 

LWKKLKKWMDIFRECDQDHSGTLNSYEMRLVIE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LRLK1WIITLTMDPKNTGHICLSLEQVLGEGW 

EGICRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 


3656 


A 


3 


174 


PI/nHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYDCSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GHPMKSRSPALL 


3658 


A 


92 


1537 


SEAPVQPQPYTMTSFYSTSSCPLGCTMAPGARNV 

FVSPBDVGCQPVAEANAASMCLLANVAHANRVR 

VGSTPl^RPSLCIJPrn , SHTACPU > GTCrnPGNIGIC 

GAYGKhTILNGHEKBTNIKFL>©RLANYLEKVRQ 

LEQENAE1JETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKIIXSKAENARLrVQIDNAKLAADDFrUKL 

ESERSOIQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVKILRSQLGEKFRIEL 

DIEPTTOLNRVLGEMRAQYEAMVETNHQDVEQ 

WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 

ARLENEIATYRM.TPLQSLFHACLLYrXSKLWPC 

HRWVSLWPWSQHGEMILKARVRRLRLVALGSG 

VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQOSQC^OPPPKCTP 

nciPKcn K:m:cppKci>PQYSAv ^fpyzsc ?g , 

SSSGGCCSSEGGGCCi . t liKRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGOTAKNGGIXI^TNMKWVQFSNUTVnDWKD 

LTKPWTISDEPDILYKRLSVLVKGHDKAVLDSY 

EYFAVLAAKELGISKVHEPPRKIERFTLLXJSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRNLPEGVAMEVTKFCFFIFIJ)TIRTVTRTHQGA 

NLGNTIRRKRRKQVIKPQGGHFCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYKNPTKMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCO^HHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RKSU»HPNPQKMLKKPLSAVTWLCIFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

VANI^SIXSELNKKQERDWVSVVMQVMELESN 

SKRMESRLTDAESKYSEMNNQIDIMQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEVFCDMETSGGGWTnQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHIHRLSRQPTRLRVE

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY

TGNVGNDALQYH>OTAFSTKDKDNDNCLDKCA 

QlJUCGGYWYNC(ni)S>ILNGVYYRLGEHN]aiLD 

GITWYGWEiGSTYSLKRVEMKIRPEDFKP


3663 


A 


64 


1456 


l^AKETLAQMYNTVWNMEDLDLEYAKTDINC 

GTDLMFYIEMDPPALPPKPPKPTTVANNGMN^ 

MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 

RDASTBCMHGDYTLTLRKGGNNKLIKIFHRDGKY 

GFSDPLTFSSVVELINHYRNESLAQYNPKLDVKL 

LYPVSKYQQDQVVKEDNIEAVGKKLHEyNTQFQ 

EKSREYDRLYEEYTRTSQEIQMKRTAffiAFNETIK 

IFEEQCQTQERYSKEYIEKFKREGNEKEIQRJMHN 

YDKLKSRISEIIDSRRRLEEDLKKQAAEYREIDKR 

MNSIKPDLIQIJIKTRIXJYLMWLTQKGVRQKKL 

NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 

GSSNKNKAEhUXRGKRIX5TFLVRESSKQGCYAC 

SVVVDGEVKHCVINKTATGYGFAEPYNLYSSLK 

ELVLHYQHTSLVQHNDSLNVTLAYPVYAQQRR 


3664 


A 


944 


406 


GATVEI>QSCNFGSLRWVVSVPHISARSCPDPLLS 
RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 
QVDVFILTGAFGD^AAHVPTLQVLRPGLVVVHA 
EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 
MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 
IEANEALVKALE 


3665 


A 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH 

RHQSGRCLSTCMAPNIJCGRPRKKKPCPQRRDSF 

SGVKDSNNNSDGKAVAKVKCEARSALTKPKNN 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETITARRQWKHIYDELGGNPGSTSAATCTRR 

HYERLDLPYERFIKGEEDKPLPPIKPRKQENSSQE 

NENKTKVSGTKRIKHEIPKSKKEKENAPKPQDAA 

EVSSEQEKEQETLISQKSIPEPLPAADMKKKIEGY 

QEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGF " : irl .PLjlPE'in " ?7"GASKCPL i ~rD 

ALVDSKQESKLCCFTESI 5: SH?Qii VSFPRLPHH T G 

HRWQTRMRRRMTNCPPWQ.:rLPTAP


3666


A 


113 


1492 


LLQEMCTKTIPVLWGCFIXWi^>7^ 

KARTIX)RAIJ>YGVQAGMKMIEQ^XEKKLPDL 

SGSESIJBFLKVDYV>mn^KISAFSFPNTSlJ^ 

VPGVGDCALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPILKhn^NEMLCPIIASEVKALNANLSTLE 

\T-TKIDNYTLIJDYSLISSPH 

PIJENLTDPPFSPVPFVLPERSNSMLY1GIAEYFFKS 

ASFAHFTAGVFNVTLSTEEISNHFVQNSQGLGNV 

LSRIAEIYII^QPFMVRIMATEPPnNLQPGNFTI^ 

PASIMMLTQPKNSTVETIVSMDFVASTSVGLVIL 

GQRLVCS1JSLNRFRIALPESNRSNIEV1JRFEN1LSS 

IUJFGVIJIANAKIXJQGFPLPNPHKFLFVNSDIEV 

LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRQ 

WRGKSAP 


3667 


A 


1 


181 


FRGRLGSGRNGGGSMNAPPAFESFLLFEGEKITIN 
KDTKVPNACLFTINKEDHTLGNIIK 


3668 


A 


212 


431 


VAGEAVPFFPMMYSEPLKPSYLALVLWYFLLTG 
YCITKPEVIFKffiQGEEPWILEKGFPSQCHPAKYL 
WCLHD 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRLQ 
MYNSQHRSAISCIRTVWRTEGLGAFYRSYTTQLT 
MNIPFQSIHFITYEFLQEQVNPHRTYNPQSHnSGG

I^GALAAAATTPLDVCKTU^TQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A 


145 


298 


RNPCPLTFLPSTLN1VLLLSLTFFSALTFHSICQLRN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ILKVAKKERTMSSLPVPYKLPVSLSVGSCVIIKGT 

PIHSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVMNRREFGrWMLEBTTDYWFEDGKQFBLC 

rVVHYNEYEIKVNGHTHLRAI^HRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1 


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTKGM 

VILRDKIRFYEGQKLLDSLAETWDFITSDVLPML 

QAIFYPVQGKEPSVRQLALLHFRNATTLSVKLED 

AlJVRAHARWPAIVQMLLVl^VHESRGVTEDY 

LRLETLVQKWSPYLGTyGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAJK^VVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 


3673 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGEMEDIPraVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREEKNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3674 


A 




712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI

WFHVDKI^SAHVYLRLHKGENIEDi. ; 2 vLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG

NDSDEFM 


3675 


A 


921 


1321 


VTlJtfyvIRVmSSCLKVQEQMAN 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDFNRWCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLIXjVYETVAPTLACIJRPRIJIRRRRRRR 

RRMISRYTRKAVPQSlJBLKGrrKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSD1TRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLRILGRQETPSEGYRLYPRSPSAVSASYET 

TLSQERDSTIFGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDS1IVSEGIIEEYLAFDHIDIEEGFHG 

KKSEAATEKQKLGYPPIAPFYCMKEDVLAYVFD 

SVWCKWSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYTTSNPMS 

LCQASRHQPNVNDLLVHGMPLQPRNLSLMDKLL 

DLDDKLLMRPGSSTILSTRNWPNRAVEFSTSSLS 

YTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEE1L

RGARWVAPDSI^SPSPTPI^RNNLLPPIGTAEVE 

HVSTVGPQRQMKPHGDSSRAQSAWDEPNYQQ 

PQERLUPDFFPRPNTIXJSFLLDTQYRRSCAVEYP 

HQARPGRGSAGPQLHGS1XSQSGGRPVSRTRQG 

P 


3677 


A 


246 


757 


MRLQGAIFVLLPHLGP1LVWLFTRDHMSGWCEG 

PRMLSWCPFYKVLLLVQTAIYSWGYASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

VHNPGlJUiHIXLLYGLWSTA^ 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 

EKSD 


3678 


A 


20 


1508 


RGKAEFFLAMAGTNAIXMLEOTIIXjKFLPCSSY^ 

DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 

SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 

DQGKTIJVIARTMDIPRSVQNFRFFASSSLHHTSE 

CIXJMDHLGCMirmT^VGVAGLISPWNLPLY 

LLTWKIAPAMAAGNTVIAKPSELTSVTAWMLCK 

LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 

ISFTGSQPTAERITQLSAPHCKFCLSLELGGKNPA1I 

FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK 

SIYSEFLKRPVEATRKWKVGIPSDPLVSIGALISK 

AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 

RNQAGYFMLPTVITDIKDESCCMTEEIFGPVTCV 

WFDSEEEVffiRANNVKYGLAATVWSSNVGRVH 

RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 

GREGAKDSYDFFTEDCTITVKH 


3679 


A 


1862 


502 


MAGTKPYMEIQTTIREYYEHLYANKLENLEEMD 

KFLDTYTLPRLNQEEVESLNRPITGSEIEAIINSLP 

TKKIPGPDRFTAKFYQRYKEELSNLIHYLGLSHH

LLALNFIIVSFGKKSAWSSAQVKVTDTDFDGVEV 

^VFEG?rT^PIJCR^VV^^ 

YDELCTA^lAii, -iLNAViVSIEYRLVPKVYFPEQlH 

DWRAr^IYFLKPEVLQKYMVDPGRICISGDSAG 

GNLAAALCQQFTQDASLKNKLKl^ALIYPVLQA 

lJ>F^SYQQ^/rm>ILPRYVMVKYWVDYFKG 

NYDFVQAMIVNNHTSIJDVEEAAAA^RARLNWTS 

iXPASFIKNYKPVVQTTGNARIVQELPQLLDARS 

APLIAIXJAVLQLIJ^KTYILTCEHDVLRDDGIMYA 

KRLESAGVEVTIJDHFEDGFHGCMIFTSWPTNFSV 

GIRTRNSYIKWLDQNL 


3680 


A 


249 


2146 


RSWGAPWFWRMRLIJlE^RHMPIJtLAMVGCAFV 

LJFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKIXJIRAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVIXJKFRRCPPLATTSVIIVFHNEAWS 

TLLRTVYSVLHTTPAILLKEIILVDDASTEEHLKE 

KLEQYVKQLQWRWRQEERKGLITARLLGASV 

AQAEVLTFU)AHCECFHGWIJEPLLARIAEDKTV 

VVSPDIVTIDLNTFEFAKPVQRGRVHSRGNFDWS 

LTFGWETLPPHEKQRRKDETYPIKSPTFAGGLFSI 

SKSYFEfflGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEIIPCSWGHVFRTKSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKIFYRRNLQAAKMAQEKSFG 

DISERLQLREQIJICHNKWYLHNVYPEMFVPDL

TPTr^GAIKNLGTNQCLDVGEh^GGK^IMYS 
CHGUKJNQYFEYTTQRDLRHNIAKQLCLHVSKG 
ALG1X5SCHFTGKNSQVPKDEEWELAQDQLIRNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISN1JCEH1JVSKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSVLASKLK 

ESVKEKEKVHSEVVQIRSEVSQVKREKENIQTLL 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 

I^EDKDKKINEMSKEVTKIJKEALNSLSQLSYSTS 

SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 

\aSVYRMHLLYAVQGQMDEDVQKVLKQILTMC 

KNQSQKK 


3682 


A 


447 


1024 


AQALTAGRQ1JUJVAPFIAPISPISLPRLNPPSQSW 

NSTPFFKVKLPPQKEV1TSDELMAHLGNCLLSIKP 

QEKSEGLQLOTQQNVDDAMTVLPKLATGLDVN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSYNQiyVGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV 


3683 


A 


2 


942 


IJEKQEEKFVGQC1KEELMHGECVKEEKDFLKKE 

IWDTKVKEEPPINHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAI^KRPIS]^YMYFMK>OlARRQGINLKLLPNG 

FTKRKENSTFFDKKKQQFCWHVKLQFPQSQAVST 

♦KKRVPDDKTTNEILKPYIDPEKSDPVIRQRLKAYI 

RSQTGVQILMKIEYMQQNLVRYYELDPYKSI 7X> 

STKNVGNEN 


3684 


A 


119 


153; 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVFTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAW 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHIXjSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

♦GGDLTPVPDGPHDCPRDVQGIPGAGGGSQLAPC 

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGERWQ 

KEPE^GPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPOnaPyTVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

YSLP*IX^ECCSCP/PPPPAHGGRCPSLLPPBALAK 

LLL 


3685 


A 


101 


438 


AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 
LQLQQQLHCHFbraTfflCVTNLEYNXKEYPWDLV 
KAHLQGASTSNITFDIGELQKKVILDLNKQTQEFQ 
PSL*AWTEFQQGLE 


3686 


A 


105 


845 


VSDWKNQLVEVQCRQDGCDAVENVHQMFMF 
NWFITXXWTLPLSNYQPSVESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNFSSEI
EDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE

EEEEEGEDDEDADNDDNSGCSGENKEENIKDSS
GQEDETQSSNDDPSQSVASQDAQEDRPRRCKYF 
DTNSEVEEESEEDEDYIP/SnSFFQSSDGI*SSSSSE 
DWKKEIMVGS 


3687 


A 


49 


1225 


PVLVTSLRMREADTLRPPQLMEVSADnSTVEFN 

HTGELLATGDKGGRWIFQREPESKNAPHSQGE 

YDVYSTFQSHEPEFDYLKSI^IEEKINKIKWLPQQ 

NAAHSLLST^KTKLWKITERDKRPEGYNLKDE 

EGKIJKDLSTVTSLQVPVIJC^^ 

NGHTYHINSISVNSIXIETYMSADDLRJNLWHLAI 

TDRSrTP\NIVDIKPANMEDLTEVITASEF^ 

l^FVYSSSKGSIJaCDMRAAALCDKHSKLFEEPE 

DPSNRSFFSEnS\SVSDVKFSHSDRYMLTR\DYLT 

VKVWDLNMEARPIETYQVHDYLRSKLCSLYEND 

CnTDKFECAWGSDR/IIMTGAYNNFFRMFDROT 

KRDVTLEASRGSSKPRAVL 


3688 


A 


1 


40) 


KKVPGRLSEMSFSLNFILPANTTSSPVT\DCGPSL 
GLAAGIPLLVATAIXVALLJTLIHRRRSSIEAMEE 
SDRPCEISEIDDNPKISENPRRSPTHEKNTMGAQE 
AHIYVKTVAGSEEPVHDRYRPTIEMERRR 


3689 


A 


698 


889 


GRVLVHCAMGVSRSATLVLAFIM1YENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 


3690 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3691 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3692 


A 


3 


2831 


PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR

DKAEHCINDIAFKPDGTQLILAAGSRLLVYDTSD

GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD

KSVUWTSKLEGILKYTHNDAIQCVSYNPITHQLA 

SCSSSDFGLWSPEQKSVSKHKSSSKnCCSWTNDG 

QYlALGMF^GnSIRNKNGEEKVKIERPGGS7,SPI 

u'sicwnpss;, : P *£srw^^E - -.- ^ - /ivnr - iq 

HlVSiLKSAVYSSQGSEAEEEE} :^buDOPRDDNL 

EERNDILAVADWG\QKVSFYQL^ JKQIGKDRAL 

NFDPCCISYFTKGEYELLGGSDKQ V3L vTKDG VR 

LGTVGEQNSWVWTGQAKPDSNYVVGGCQDGTI 

SFYQLIFSWHGLYKDRYAYRDSMTDVIVQHLIT 

EQKVRKClCELVKKIArfRNRI^ 

SEDLSDMHYRVKEKIIKKFEChn^ 

QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 

GREGLLVGIJa^GQILKIFVDNLFAIVLLKQATAV 

RCUDMSASRK10LAVVDENDTCLVYDn)TKELLF 

QEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 

PVHRQKLQGI^GYNGSKIFCLHVFSISAVEVPQ 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA

MEALEGLDFETAKKERKKRGETNNDLFLADVFS 

YQGKFHEAAKLYKRSGHENIjVLEMYTDLCMFE 

YAKDFLGSGDPKETKMLITKQADWARNIKEPKA
AVEMYISAGEHVKAIEICGDHGWVDMLIDIARK
IJ)KAEREPLLU:ATYIJCKLDSPGYAAETYLKMG
DLKSLVQLHVETQRWDEAFALGEKHPEFKDDIY

MPYAQWLAENDRFEEAQKAFHKAGRQREAVQV 

LEQLTNNAVAESRFNDAAYYYWMLSMQCLDIA 

QDPAQKD 


3693 


A 


3 


1099 


SSFPTCMRTVFHSNTSVSSIXHRPGHVTPQLTIHG 
GWRHHRDHTAIDEWDFNPSKFLIYTCLLLFSVLL

PLRLDGnQWSYWAVFAPIWLWKLLWAGASVG 

AGVWARNPRYRTEGEACVErlwAMLIAVGIHLLL 

LMFEVLVCDRVERGTHFWLLVFM^ 

AA<^WGFRHDRSLELEILCSVNILQFn?IAliXDRI 

IHWPWLVVFVPLWIOdSrXCLVVLYyiVWS^^ 

RSLDVVAEQRRTHVTMAISMTIVVPLLTFEVLL 

VHRLDGHhTITSYVSIFVPLWLSIXTLMATTFRRK 

GGNHWWFAIRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL 


3694 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3695 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696 


A 


456 


733 


I^AALWEEPE^LWSETKELTh^GKMNYPQIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
HVGGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877 


1873 


VWL*T1^*HTCALMTVCRSCLVKYLEENNTCPT 
CRWIHQSHPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDIKGETCSAKQHLDS 
HKNGETKADDSSNKEAAE 


3698 


A 


1 


572 


KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 

LRRDNPRFNLMLGERNRLPFGRLGHEPGLVQLV 

NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 

SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 

ASYNRKKEDGEGNVWIAKSSAGAKVWVQW*M 

TDLEEEIDIPSPVGLGLESEWPL 


3699 


A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 
HHLQPVQVLQTLLFSATA fTGCRRPARPPPAPPT 

PTPWRSAQSCV OSERAS ^ .GAC :iYULGA - a. Jl. : 
GGRALGGSK v i VPLPGKiLFSGCKHRRRRi ^ SD
AAPGEEAGT


3700 


A 


3? 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRK 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVTPAVI^RPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKIXJACYIFHHRCRLLEGVKQALWLTKTKL 

IEGLPEKVLSLVDDPRNHIENQDECVLNVISHARL 

WQTTE^KRETYCPVIVDNLIQLCKSQILKHPSL 

ARRICVQNSTFSATWNRESLLLQVRGSGGARLST 

KDPIJ>mSREEIEATKNHVLJm^ 

IYDVKNDTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEQPWVQSVGTIXjR\rrTIFLWQL>riTDLDS^ 

GVKl^WVDSDQIXYQHrWCIJPVIKKRV^ 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86 


465 


WTLCGPEAGMVGYDPKPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVIKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILS1GY 
GAVQFLSFEMLTELVHRGSVYDARE 


3702 


A 


166 


814 


GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEIFQEDTVRSPFLYNKDVNGK 
VVLWKGDVALLNCTAIVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFmnVGPKYKSRYRTAAESSLYSCYRNVLQLA

KEQSMSSVGFCVINSAKRGYPIJCDATHIAIJITVR 
RFLEfflGETIEKW 


3703 


A 


128 


1255 


SLGPSPKSATIPCCGDTMAPEEDAGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERARIE 

KAYAQQLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEIXjFRKAQKPWLKRL 

KEVEASKKSYHAARKDEKTAQTRESHAKADSA 

VSQEQLRKLQERVERCAKEAEKTKAQYEQTLAE 

LHRYTPRYMEDMEQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKFHELHRDLHQGIEAASDE 

EDLRWWRSTHGPGMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVTLTSIVPTRDGTAPPPQSPGSP 

GTGQDEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMPCPNSLVYDC 

WLNIKEC^VGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKKKLKDSE 


3705 


A 


170 


1318 


LNWANLVIMWPREEEKEKVQDYSLGGLSPDLRI 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCrX^RKSFTORSFRPNUJIANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC 

WCRESRSHKQHSVLPLEEWQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 

AQGFRSGRHYWEVCMGP 


3706 


A 


204


1996 


SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 
. r THVLGrJ?IIi? r, ? T : PGGGSFLTPVT^JCTIHI^TFP 
Qt;IlIPQl^SRLGLGARTRSVPPQFiGIAL ^ 3LSP 
LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 
El PTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 
Sii^FGNPGLTKSNRMLATEKPLVSSYIALPFQSR 
IAQSAPVLAEPGSLGQGHLVSVTDHN1PTRASPG 
KGKPRARGIPRPRGRLQRANTTVNLTAMDTRTD 
AARH1j\TMATNRPSIAIN1ATPNTSQ1^ 
ALDIKLGTARDLSSVGTVKSGKTVNLATAGTIKP 
GTAMNLITVGTTKPGMVMDLJASEPDKLGKAM 
ATRSTAKPDMTTEGIAMDSATSDPVKPDTITATV 
GTSRLETAMAIARVNRAKIX3TAKNSLALDTSR 
MGTAVGSWPVTPDPATGKTTLGSVNNLTISDV 
ATCLLMPSRSTD1AIJDNTOAAMDRATEPASLDL 
ATEYKGKCRNLVGDGLGCREGEVCELGDGSMK 
PMSINSNLLGYIGIDTIIEQMRKKTMKTGrTDFNIM 
WGTEGCGAAAGLVAGSTKDPISFPQ 


3707 


A 


3 


549 


SSS1SRDFLGQAACASGTMLRWLRDFVLPTAACQ 

DAEQPMRYETLFQALDRNGDGWDIGELQEGLR 

>nirGIPLGQDAEEKIFTTODVNKIX3KLDFEEFMKY 

LKDHEKKMKLAFKSLDKJWIX^ 

TLGLTISEQQAELHXJSIDVDGTMTVDWNEWRD 

YFLFNFVTD1EE11R 


3708 


A 


1 


1866 


EFRGAGRANMIAPRGAA\QJLXHLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND

LYVISTFKLQTKSSATIFGLYSSTDNSKYFEFTVM 

GRI^KAILRYUCMXjKA^HLVVFNNLQIADGRRH 

RILIJU,SNIXJRGAGSl^YLDCIQVDSVHNLPRA 

FAGPSQKPET1ELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTWPPASPAPPTRPPRRCPSNPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCINLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

LGSYRCGPCKPGYTGDQIRGCKAERNCRNPELN 

PCSVNAQCIEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGDLNEQDNCVUHNV 

DQRNSDKDBPGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNILDNCPKFPNRDQRDK 

DGDGVGDACDSCPDVSNPNQ 


3709 


A 


144 


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VIXIVGDQKFRAHKNVLAASSEYFQSLFTNKENE 
SQTVFQLDFCEPDAFDNVLNYIY 


3710 


A 


245 


688 


FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LRNLSVADHSKTQVQKKENKSLKRDTKAIIDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGLI 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 

TPAMMNGQGSTTSSSKNIAYNCCWDQCQACFNS 

SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 

PSTSQSWLQRHMLTHSGDKPFKCWGGCNASFA 

SQGGLARHVPTUFSOQNSSKVSSQPKAKEESPSK 

AGMNnXUUO^TXRRRSl^'i 

HRAICFNLSAE; IbLGKGHSVVFHSTVSILLi QJK 

YKTLQKNISTIISKSLKI 


3712 


A 


-> 

j- . 


344 


RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGNTRIEEACEMYTRAANMFKMAKNWSAA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 


3713 


A 


20 


974 


GAAATACSSSSSSSGAPATWAAHGPGKDVASPS 

SVSI^PRRSRIXVIJICGLRRNPERPSSSPALRRLL 

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHWSGKVMSRRAPGSRLSSGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAWSRQRHDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGQELRVNNVTSPE 

FTSVQHGSRALATKDMRKSQERSMSYCDESRLS 

YLUlRTraENDRDRRLATVKQLKEnQQPENKLV 

LVKQLDILAAVHDVLNER 


3714 


A 


237 


458 


IFAIJK5PSYLLPCCITEGKMDHKQLCWSHPQKSG 
QSSRS(XI(^NQHGLIWKYSLNMCLQCCHQYVK 
DIGFIKL 


3715 


A 


970 


1524 


IXrTLSPGISGTAGSCLTIEPGTELGTSFAQNGFYH 

EAVVLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSK3ALQPLPHAELAPSGLPSLRC

PRSTALRSPGLSPLLH 


3716 


A 


85 


308 


QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 

WLISPLDISQLQPPLPIXJVVIKTQTEYQLSSPDQQ 

NYTKSR 


3717 


A 


58 


618 


GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRKRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEWD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDIKKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAHAILTDATKRNIYDKYGSLG 

LYVAEQFGEENVNTYFVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 


3719 


A 


2 


2173 


SGGVRMGSRAIXjPRTSGHVTGKMAVFPWHSRN 

R2WKAEFASCRLEAVPI^GDYHPLKPnVTESK 

TKKVNRKGSTSSTSSSSSSSWDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSWG 

SDFEPWTNKRGEILARY ITIEKLSINLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQDYVNRIEELNQSLKDAWASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRFYVEAS CLKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLWQGVELPSYLPLYPPAMDWIFQCISYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

^SMDHGMIKECDESGFPKHLUTlSLGLNlJUAD 

LYTOCHFTKREVi rvi^\/lKHMTPDRAFEDSY 

PQLQLHKKVIAHFKT vSVLFSVEKFLPFLDMFQK 

ESVRVEVCXCmmS3LlCSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKR^/iLSYLINGFIKMVSF 

GRDFEQQLSFYVTSRSMFCNLEPVLVQLIHSVNR 

LAMEIRKVMKGNHSRKTAAFVRSWGAYWFTnP 

SLAGIFTRLNLYLHSG 


3720 


A 


24 


296 


ENIJRAGFAFSLLRSSFYISKTYCSWFSNLISGSL 

ADFNSKGTRDYSPRQMAVRE/KVFDVIIRCFKRH 

GAEVIDTPVFELKVRNGQEBTTW 


3721 


A 


2 


310 


PSCLTC^GHCSIGGSCTMIGIMNO>ECHCSlJrD^G 
PRCEEHVFIIX^PGHIASIIJPLLVLLLLALVAGVV 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 
K 


3722 


A 


75 


722 


MELVAGCYEQVLFGFAVHPEPEACGDHEQWTL 

VADFTHHAHTASLSAVAVNSRFVVTGSKDET1HI 

YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 

HXjLICTWDAKKWECLKSIKAHKGQVrFl^IHPS 

GKLAI^VGTDKTLRTWNLVEGRSAFDCNIKQNA 

HIVEWSPRGEQYVVnQNKIDIYQUyrASISGTITN 

EKRISSVKFLSES 


3723 


A 


110 


316 


MEl^DNRRSGGLEGlJVEKCraLTYLNLSGNKIK 
Dl^TVEALVSGTVLSLDLLFLVKFSEICLCLLlSI 


3724 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKETOLK1PQVIXVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 


3725 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 

FIJ^REVPFLLELVQQIJIEKEPGIAIPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 

DG 


3726 


A 


1 


433 


SSDDRSLFRRLKLNYAIFDEGHMLKNMGSIRYQ 

HLMTINANNR1JXTGTPVQNNLLELMSLLNFVM 

PHMFSSSTSEIRRMFSSKTKSADEQSrraKERIAH 

AKQDKPrTLRRVKEEVIJCQLPPKKDRIELCAMSE 

KQEQLYLG 


3727 


A 


6 


383 


RJOPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AFOTKGQLRinKTGEPFVFNYREHLHRWNQKRY 
EALGEIITKYVYELLEKDCNSKKVS 


3728 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPDLEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSATPERLVRSRSSVDIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDKNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST

EVMGDGESAHDSPRDEALQNISADDLPDSASQA

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLTV

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR

DKEVANRYFITVCVRLLLESKEKKIREFIQDFQK

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSATPERLVRSRSSVDIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS

AQAQVAEDIIJ)KYKNAIKRTSPSDGAMANYEST

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFITVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD

RK 


3730 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHSATPERLVRSRSSVDIVSSVRRPMSDPSWNRR

RGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVL*n 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR

DKEVANRYFITVCVRLLLESKEKKIREFIQDFQK

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3731 


A 


1 


1305 


VNTAMHEAKLMEECDELVEnQQRKQMIAVKIK 

ETKVMKLRKLAQQVANCRQCLERSTVLINQAEH 

I1JCENDQARFIX3SAKNIAERVAMATASSQVLIPDI 

NFNDAFEOTALDFSREKKLLEGLDYLTAPNPPSIR 

EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 

ANHSLYNSVDSWMIVPNIKQNHYTVHGLQSGTR 

YIFIVKAINQAGSRNSEPTRLKTNSQPFKLDPKMT

HKKLKISNDGIXJMEKDESSLKKSHTPERFSGTGC 

YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 

PKNEWIGKNASSWVFSRCNSNFVVRHNNKEML 

VDWPHIJCRLGVLLDYDNY/NMLSFYDPANSL\H 

LHTFDVTFVILPVCPTFTIWNKSLMIL^GLPAPDFI 

DYPERQECNCRPQESPYVSGMKTCH 


3732 


A 


127 


2832 


1X}QR1^LWRPSIJGUILGKRLSLGLRERMMSLW 
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 

EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACL WIEN* SMWM/PETFPGTQGQKGIQP WFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPRBESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTONLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEANXIDSGTETKKCLn^PWKLRA 

QKDVDSDRVKQEPRFEEEVnGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASENn^EAKPKNLELSPE 

GEEQESLLQPJXJPSPEFTFQYDPSYRSVKEEREHL 

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KIRDPFIHEISKIAMGMRSA SQPTRDFIRDSG WS 

LIETLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE 

TOCQVCEETLAHSVDSl^QLTGNKGCFRHLTMT 

roYHmiAN*YGPGFPIJLF*PQAQCGETKFHVLK 

MLLNLSENPAVAKKLFSAKALSEFVGLFNIEETN 

DNIQIVIKMFQNISNIIKSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW 


3733 


A 


2 


3274 


DWLIRffiEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAILPDEIFRLVKIRFLIEDINDNAPLFPAT 

VTNISIPENSAINSKYTLPAAVDPDVGINGVQNYE 

LDCSQNIFGLDVIETPGGDKMPQLIVQKELDREEK 

DTYVMKVKVEDGGFPQRSSTAILQVSVTDTNDN 

HPVFKETEIEVSIPENAPVGTSVTQLHATDADIGE 

NAKXHFSFSNLVSNIARRLFHLNATTGLITIKEPLD 

REEl T ItiWl -Us 7 1 ASjy:?" !>T;" ; RAMVLVl -vrw 

MJN^TSroiRYIVNPV;^ 

VTDKDADHNGRVTCFiT. HEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAIKLLA r>DAGKPFLNQSAM 

LFIKVKDENDNAPVFTQSFV'i 7SI?ENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKKLDREKEDKYLFTTLAKDNGVPPLTS 

NVTVFVSIIIXJNDNSPVFTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFIIDSQ 

TGVIRPMSFDREKQESYTFYVKAEDGGRVSRSSS 

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGTWFQVIAVDNDTGMNAEYRYSIVGGNTRDL 

FAIIXJETGMTLMEKCDVTDLGIJIRVLVKANDL 

GQPDSLFSVVIVNLFVNESVTNATLINELVPQKH 

LKHQ*PQIl^IADVSSPTSDYVKILVAAVAGTnV 

VVVnTTAVVRCRQAPHIJCAAQKNMQNSEWATP 

NPEl^QMIMMKKKKKKKKHSPK^LLNVVTIEE 

TKADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

LKHHIIQELPLDNTFVACDSISNCSSSSSDPYSVSD 

CGYPVTTFEWVSVHTRPPVDLEVGGAQSGQVAI 

LTSSUV1ELIXO.MVAAFLPLELRPLGQQNVMSW 

EQEAKELLVGYWGDGEWCHFHFHHLIPGPVNPG 

YERKQYHILDSDSEDTQPSGELCPIPVRPFTILSIQ 

LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 


3734 


A 


1 


840 


GTRPGHLJ , APSIXjFCV/HL*SIPSWGSF*GES17EM 

QLH^LGUJEFDIAM^l^IYAQTLVWIGIFFCPL 

I^HQMIMIJTMFYSKI^LMMNFQPPSKAWRAS 

QMMTFFIFLLFTPSFTGVUri^ 

GPFRGLPLFfflSIYSWIDTLSTRPGYLW 

UGSVHFFFILTLIVLIITYLyWQITEGRKIMIRL 

EQHNEGKDKMFLIEKLIKLQDMEKKANPSSLVLE 

RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR 

A 


3735 


A 


2 


432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPFIFN 

m^SKlKLLHTDTLLKIESKKHKAYLRSAAlEEERE 

SEFALIU^DLTVRRNHLIEDVLNQLSQFENEDL 

RKELWVSFSGEIGYDIX5GSAnKKEIFYCLFAEMIQ 

PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVIUL»YQYPNFAGPHAALANKSFFKADKV 

TMLWNKKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAWQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPRNAAYYS 

PHGHILV1^GFGNLILQI*AD/IMKVWNVKN^ 

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

GYKIWHYTGSILHKYDVPSNAELWQVSWQPFLD 

GIFPAKTITYQAVPSEVFNEEPKVATAYRPPALRN 

KPITNSK1JIEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NTVSQSISGDPEIDKKIKmKKKLKAIEQLKEQAA 

TGKQLEKNQLEKIQKETALLQELEDLELGI 


3737 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSMV 
yrv,YriffiIQVQTL^?OEEEIX\^\GGC3PGlJVP 
RLLAi ISSMLGEGQ VLRSPTNRLLLHFQSPR W ? i 
GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG
TATRICDSGYQLQGEETLICLNGTRPSWNGETPS
CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR
WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG
SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 
PANPLLLSLRFEAPEEDRCFAPFLAHGNVTTTDPE
YRPGAIATFSCLPGYALEPPGPPNAIECVDPTEPH
WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 
SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 
TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 
QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 
PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 
DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 
HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 
YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 
YQTLYKHHYQAGESLRFFCYEGFELIGEVTTTCV 
PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL
AILLPLGLVIVIXjSGWIYYTKLQGKSLFGFSGSH 
SYSPITVESDFSNPLYEAGDTREYEVSI 


3738 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTIITTTTVTTTVTSPVLC
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSMV

YPGYGffilQVQTLNLSQEEELLVLAGGGSPGLAP 

RLU^SSMLGEGQ\OJlSPTNRLLXHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PA^lXl^LRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKFVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVUjSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A 


734 


445 


LLEPEPAEEYTEQSEVEST/EGMIU*CCLYFAAFQ 
TNVSNrYFALQYVNRQFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY 


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLWILDGSYSVGPENFEIVKKWLVNITKNF

DIGPKFIQVGVVQYSDYPVLEEPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAVVLTDGKSQDDVKDAAQAARDSKITLFAIG

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR

EVMKQKLCEESVCPTRTPVAARDERGFDILLGLD

FPEGU>PSYVFVSTQRFKVKKIWDLWk: judgj* 

PQIAVTLNGVDKILLFTTTSVINGSQVV'i F Ar 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIEK r Kr\. 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL

PGYKGEPGRDGDK 


3741 


A 


5048 


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPHTVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDl^IRKHTGSIAVANNNFnTVADSLSCPVM 

QNVQPPKSSPWSTVLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 

IGG1^U5SSPQPESLRPVNLTQERMIJ>MTPVWAP 

VPNLNADLKKLNCSPDSFRCTLTOPQTQALLNK 

AKU ) LGLLLHPFRDLTQLJ > VlTSKnVRCRSCRTYI 

NP\FVSFroQRR*KCmCYRVNDVPEEFMYNPLT 

RSYGEPHKRPEVQNS\TVEFIASSDYMLRPPQPAV 

YLFVLDVSHNAVEAGYLTI/LWCQSLLEVNLDKLP 

G\DSRTARIGFMTTO\STYSFLQFTQEGLSQPQMLI 

VSDIDDVI^PTPDSLLVNLYESKELIKDIJLNALPN 

MFTbHTlETHSALGPALQAAFKLMSPTGGRVSVF 

QTQLPSLGAGLLQSREDPNQRSSTXWQHLGPAT 

DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 

CMSKYSAGCIYYYPSFHYTHNPSQAEKLQKDLK 

RYLTRKIGFEAVMRIRCTKGLSMHTraGNFFV 

TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQT 

ALLYTSSKGERRIRVHTLCLPVVSSLSDVYAGVD 

VQAAICLLANMAVDRSVSSSLSDARDALVNAW 

DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 

LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 

KMIHPNLYRIDRLTDEGAVHVNDRIVPQPPLQKL 

SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 

DVLGYTNFASIPQKMTOLPELDTLSSERARSFIT 

WLRDSRPLSPILHTVKDESPAKAEFFQHLIEDRTE 

AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMIVPAAPYLPGLIQGNQE 

AAAAPDTMAQPYASAQFAPPQNGIPAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK 

QVSSPSGLHVSNIPFR\FRDPDLRQMFVGQFGKILD 

VEDFNERGSKGFGFVTFENSADADRAREK\LHGT 

VV\EGRKI\EVN\NATARVMTNKKTVNF 

T.Nr^G/A'y3P^AGTVIiCQA<:'^P^^^ 

PSTDFRGAKLHTSRr 1 *.; ; H-1 


3743 


A 


3 


1456 


QFQQAWMQNKVPffAPhffiVO^DRKEDIKLEBKK 

KTQAEIEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGSNKGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 

HMGPQGPPGPQGHIGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


pltgrkcpgwthsgsrrspriaeevpgfpkraea 

srqfsetadrlellrravmaaarattpadgeep 

apeaealaaarerssrflsglelvkqgaearvfr 

grfqgrXavikhrfpkgyrhpalearlgrrrtv 

qearallrcrragisapwffvdyasnclymeei 

egsvtvrdvifsplwrlkktpqgi^nlj^ktigqvl 

armhdedlihgdlttsnmlijcppleqln1vlidf 

glshsauedkgvdlyvlekaflsthphttetvfe 

AFIXSYSTSSKKARPV1JCKLDEVRLRGKKRSMV 
G 


3745 


A 


127 


1433 


GSHRFSLASPIi)PEVGPYCDTPTMRTLFNLLWLA 

l^CSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 

QDRGLVVTDLKAESVVLEHRSYCSAKARDRHFA 

GDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 

QIJCRRGREMFEVTGLHDVIXJGWMRAVRKHAK 

GL\P*CLGSCLRTGLTMISG/YVLDSEDEIEELSKT 

WQVAKNQHFDGF^TEVWNQLLSQKRVGLIHM 

LTH1AEALHQARLLALLVIPPA1TPGTDQLGMFT 

HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 

SWVRACVQVLDPKSKWRSKILLGLNFYGMDYA 

TSKDAREPWGARYIQTLKDHRPRMVWDSQVSE 

HFFEYKKSRSGRHVVFYPTLKSLQVRLELARELG 

VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 

PWSE 


3746 


A 


1 


898 


idraaecrtkplpmavsirgnadsivaclvijv1vl 

ylikkrlvacaavfygfavhmkiypetyilprrl 

hllpdrdndkslrqfrytfqacl*ellkrlcnrt 

almfvavagltffai^fgfyyeygweflehtyf 

yhltrrdirhnfspyfymlyltaeskwsfslgia 

aflpqlillsavsfayyrdlwcwfurisifvtfn 

kvctsqyflwyix:lijplvmplvrmpwkravvl 

lmlwfigqamwlapayvlefqgkntflfiwla 

glffllincsiliqushykeeplterikyd 


3747 


A 


1 


2325 


MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 

WTVKRKMTrlAWCPDLKAVWKIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN

NSDLGSAGi: . AJTDL^LI .ZTZ~* 

7 ^LrSGQRS VHETQELFPKQI \ .' Y Aw 7TDRTSN 

TKLDCS SFREN WDSD YVFGRKL/ /GQETQFRQE 

PITHNKTLSKERERTYNKSGRWFYL^^SEEKVH 

NRDSIKNFQKSSVVIKQTGIYAGKKLFK': :;ECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS 

SFARHQRCHTGKKPYEOECGKAFIQNTSLIRHW 

RYYHTGEKPFDCIDCGKAFSDfflGLNQHRRIHTG 

EKPYKCDVCHKSF\RYGSSLTVHQRIHTGEKPYE 

CDVCRKAFSHHASLT\Q\HQRVHSGEKPFKCKEC 

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 

SQIATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 

HQKTHTGEKPYECKECGKAFSQTTHLIQHQRVH 

TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 

GKAFSHRQSLSVHQRIHSGKKPYECKECRKTFIQI 

GHLNQHKRVHTCERSYNYKKSRKVFRQTAHLA 

HHQRIHTGESSTCPSLPSTSNPVDLFPKFLWNPSS 

LPSP 


3748 


A 


823 


1 


GGYTKSGYDSACKDFVPHDLEVQIPGRVFLVTG 

GNSGIGKATALEIAKRGGTVHLV 

RGEIIRE\SGNQNIFLHIVDI£DPKKIWK^ 

EHKLHVL\V>WAGCMVNKREAHKKMDre 

CQYSGVCmTTl^DPLCWRKNTDPRVmVSSG 

GMLVQKLNNQ*SPVRIOmWMGTNlVYAQNKVS 

ERQQVVLT\ERWGPRAPG\IHFSSMHPGWA\DTPG 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAQRP 


3749 


A 


1939 


715 


GFLRLSQAT\RQRLSIPVMVLTIJ)PTRD\QCFGDR 

FSIQXLDEFIXjYDDIL\MSSVKGLAENEENKGFLR 

NWSGEHYRFV\SMWMART\SYLAAFANHGQSF 

TLSVSHACCGYSHHQIFV1TVT>LLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFKDTTTAFYIILI 

VWlJUXFTOAICCHTSTSKRH^ 

AYHYRFNGQYSSLALWSWLFIQHSMIYFFHHYE 

LPAILQHVRIQ\EMLLQAPTLGPGTPTA\LPDDMN 

NNSGAPATAFVDSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAA1IT 

DASFLSGLSASLLERRPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 

VGS 


3750 


A 


2 


844 


GLLEPFSKLLSFV1QNAVFTLAYLVELCGLCYRA 

FTKERDKFYI^RSVVLEIXQALK^ 

LLVQFICADAGTKLAESTILSKQMIASVPGCGTA 

AMECWQYINEVLDFMVADMHTLTKLKSHMKTC 

SQPLHEDTFGGHLKVGLAQIAAMDISRGNHRDN 

KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 

SWLLLGSLTHNAVCVLKWPPLPGLPPLDAGSHV 

ADHLIVILIGFPEQSKTSVL\HMCSLFHAF\SLAQL 

WDSLLARQSGRW 


3751 


A


431 


2 


AFTRKCEETAFIVPQCEIIPTEAWCRR1PTGSSLER 

NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 

QLIAAKFGFAALGI/QTEVDIMSHAT*AVFEIPBKS 

RL\PQNCTPVDMKIEFGVHVTSKEILTDVIDNDS* 

RHSPS


3752


A


131 


1278 


AS'SGSQIZ :^V/3- rrASiv^NC^^CKK^FI^v^E^ 

PGGRWTe 71 SSDPAWAVEWIELPRGLSLSS 

LGSARTLRv ;VSRSSRPSSVDSQDLPEVNVGDTV 

AMLPKSRRALTi ^ BT AALARSSLHGISQ WKDHV 

TKFTAMAQGRVAriLIEWKGWSKPSDSPAALESA 

FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 

WSSVDGEDSTDDSYDEDFAGGMDTDMAGQLPL 

GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 

DTLCSSLCSLEDGLLGSPARLA\PSCWAMSCFSPN 

CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 

GWSLDEDEAEPEEQ 


3753 


A 


3 


1138 


YYSSVRQRVTCEEPRFRECAAAUEGSATEVYAG 

EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 

YGRTTRPDGSREEGKYKRNRLVHGGRVRSLLPL 

ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 

AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP 

PGGDQGPFSSPKAWPEEWGGAGAQAEELAGYE 

AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 

AGCLTEELGEPAATERPAQPGAANPLWGAVAL 

LDLSLAFLFSQLLT 


3754 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV

PLGAFHASLLHCIXPQLSSPRIAVRKRAVGALGH 

1ATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCUjSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDEIJRESCLQAFEAFLRKCPKEMGPHVPNVTS 

IX^QYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

l^DraCTLAP^mRFKEREEhrVKADVFTAYIV^ 

LRQTRPPKGWIJBAMEEPTXJTGSNLHMLRGQVPL 

WKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGDFSLADRSSSST1RMDALAFLQ 

GUX3TEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATOIJXJEVKERAISCMGHLVGHLGD 

R1XjDD1^PTLLLLLDRLR]«ITRLPAIKALTLVAV 

SPIX}U>LQPIIj\EALHILASFI^^ 

AIJDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDIXDDII^LLYQETKIRRDLJRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS

QIRSNPELAALFESIQKDSTSAPSTDSMELS


3755 


A 


2


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV

PIXiAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

IATACSTDLFVEIADHIJLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYUCHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWljaAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SIjVEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPnXPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARIJ^TDLIXJEVKERAISCMGHLVGHLGD 

RLGDDI^PTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPIIAEALHILASrlJiKNQRAIJU^ATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQXEAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQIAAGRPHTRST\aTAVKFLISDQPHProPLUC 

SFL\VHNKPSLVRDLLDDILPLLYQFIKIRRDLIRE 

VEMGPFKHTVDIXjLDVRKAAFECMYSLLESCLG 

QIJ)ICEFLNHVEIXjLKDHYDIRMLTFIMV 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQILGRIMITLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TTWGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSiySVPPSSQCLFS 

MCPSSHTLQPSFLQPGPGP\DSSRPCAASPQSGSW 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 

KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 

SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 

NLIEEN 


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDHIAE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLEERMVHAYPLW 
KGLSEKVW/IXJHLFKNRPTDGPPNSFHRSLYPKII 
QVlEi 1J^NWQCG*HTLQRIQCHSEKSKGVYCLQ 
YDDEK 


3758 


A 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 
TRRPSLMAGRQHGWSAQQSATVANPVPGANPD
LIJ>HFLGn?ErVYIVKN£^^ \ 
KCNGEWVRQVDHVIERS1XK3SSGL1 LvifcVWNV 
SR(^VEKVFGLEEYWCQCVAWSSSGTTICSQKA 
YIRIAYLRKNFEQEPIAKEVSLEQGIVLPCRPPEGI 
PPAE 


3759 


A 


1 


561 


ADDTLHLWNLRQKRPAILHSIJQ^CRERVTFCHLP 

FQSKWLWGTERGNIHIVNVESFTLSGYVIMWN 

KAIELSSKSHPGPWHISDNPMDEGKLLIGFESGT 

WLWDLKSKKADYRYTYDEAIHSVAWHHEGKQ 

FICSHSIX3TLTIWNVRSPAKPVQTnTHGKQLKD 

GKKPEPCKPILKVEFXTTR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

LKHIJRTIXSPQIX5AAKYTCMAWSQNNAKFAVC 

TVDRVVLLYDEHGERRDKFSTKPADMKYGRKS 

YMVKGMAFSPDSTKMIGQTDNHYVYKIGEDWG 

DKKVI(MKHQTVKFRPWGTLG*ThnYQYIYL*IQ 

PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYIIVFGLAEGKVRLS 

NTKTNKSSTIYGTESYWSLTTNCSGKGILSGHA 

DGYQR 


3761 


A 


2253 


320 


PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSRKVF 
QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH 
GSPQIllVRI^IGLSVraRFSTKSWI^QVCHVCQK 
SMIFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 
RLRRTESVPSDINNPVDRAAEPHFGTLPKALTKK 

EHPPAMNHIJ)SSSNPSSTTreTPSSPAPFPTSSNPS 

SATIPIWSP\GQR\DSRFNFPSCyAYFIrm\Q\Qn 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIR1XEMDGHNQDHLKLFKK 

EVMNYRQTRHENVVLr^GACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIIKGMGYL 

HAKGIVHKDUCSRNVFYDNG\KVViro 

GVVIAEGRRENQLKI^HDWLCYIAPEIVREMTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPSVFSLIJvlDMLEKI^KLNRRLS 

HPGHF*KSADINSSKWPRFERFGLGVLESSNPK 

M 


3762 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLWILDGSYSVGPENFEIVKKWLVNITKNF

DIGPKFIQVGVVQYSDYPVLEEPLGSYDSGEHLTA

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT

KIAVVLTDGKSQDDVKDAAQAARDSKITLFAIG

vgsetedaelraiankpsstyvfyvedyiaiskir 

evmkqklceesvcptrtpvaardergfdillgld 

vnkkvkkriqlspkkikgyevtskvdlseltsnv 

fpeglppsywvstqrfkvkxiwdlwrtltidg/* 

pqiavtlngvdkbxfit^ 

ktlfdegwhqirllvteqdvtlyiddqqienkpl 

hpvlgilingqtqigkysgkeetvqfdvqklriy 

cdpeqnnretaceipgfclngpsdvgstpapcicp 

pgkpglqgpkgdpglpgnpgypgqpgqdgkpvs 

teslvisgisgitgyqgiagtpgvpgspgiqgargl 

K3YXGr?GFD^r . 


3763 


A 


3 


1267


CKVWRNPLNA^-TRu^jT^YTWWGREPLTYYD 

Mh^AQDHQTi^TCDSDHLRPADAIMQKAWRE 

RNPQARISAAHEALb "NECATAYILLAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQVTRHQCLGVHQKKASNVCQKTRE 

DQGSSENDERFNEGVPPSEYVQYP*KPF\KALLEL 

QAYADVQAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGI^TAEMNAVEAIHRAVEF 

NPHVPKYLLEMKSLILPPEHILKRGDSEAIAYAFF 

HLAHWKRXflBGAI^JIXHCTWEG 

HLFYPYPICIETADRELLPSFHEVSVYPKKELPFFI 

UTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 


3764 


A 


25 


1032 


RSADGLCGNKDRERGNEFTRNQQAAQEWNPK 

KKMKKKKYVNSGTVTLl^FAVESECTFLDYIKG 

GTQINFTVAEDFTASNGNPSQSTSLHYMSPYQLN 

AYALALTAVGEHQHYDSDKMFPALGFGAKLPPD 

GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 

VQLYGPTNFAPWTHVARNAAAVQDGSQYSVL 

LIITDGVISDMAQTKEAIVNG\SKLPMSniVGVGQ 

AEFNAMVE1J3GDDVRISSRGKLAERDIVQFVPFR 

DYVDRTGNHVLSMARLARDVLAEIPDQLVSYM 

KAQG3RPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG 

KNFDSAKVPSDEYCPACKEKGKLKALKTYRISFQ 

ESIFLCEDL^IYPUjSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYKDSLLLANSKKTRNYIAIDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WUX:n,SALVHSEElJ3nvroLCSKEESIFWRLL 

TKYNQANT1XYTSQLSGVKDGDCKKLTSEIFAEI 

ETCLNEN^EmSI^PQIJRCTLGDMESPVFAFPL 

IXKIJBTfflEIflLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFT^mPEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHFE 

GCLYQITSV1QYRANNHFITWILDADGSWLECDD 

IJCGPCSERHKKFEVPASEIHIVIWERKISQ\nrDKE 

AACLPIJCKTNIXJHAI^NEKPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDimPLTLEETIQKTASVSQLNSEAFL\LEN 

KPVAEOTGIIXTNTLLSQESLMASSVSAPCNEKLI 

QIXJFVDISFPSQVVNT7^QSVQL>TITOTVNTICS 

VNNTDATGLIQGVKSVEEEKDAQLKQFLTPKTEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RKITIDLQPSVKGVNNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRIJCLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSH1PPPVPSEFNDVSQNT 

HLRQDH^CSPTKKNTC^ 

MJS^KTOIFDKvSSS^JA^ 

YLFENY 


3766 


A 


3 


1622


AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSF/VHSSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR

PYECNECGKSYSQRSHLVYHHmTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAITRKNDLIK

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3767 


A 


3 


1622 


AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSF/VHSSRLIRHQR

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR

PYECNECGKSYSQRSHLWHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY


3768 


A 


185 


2258 


SinKMSRKISKESKXVMSSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLIERKELLHN1QLLKIELS 

QKTMMIDNLKVDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETILLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYDC 

LiOUTEDQLSIPEYVSVRFYELVNPLRKEICELQV 

KKNHAEELSTNKNQlJwQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQUQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEILEASHMIQTKERSELSK 

EVVTLEQTVTIXQKI)KEYLNRQNMEI^VRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

ERENRNLREARDNAVAEKERAVMAEKDALEBCH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKRTTELQAQNSEHQARLDIYEBCLEK 

ELDEHMQTAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLDLKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKIDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP


3769


A


3 


2297 


DAAEERVY ;:r>AMKVi#FKPCT 

I .G^XKFWOGDTPLIENGK v v SliALwLSTKTDM 

VEKALLYRTVATGRDIIDKQHT QEASYGRDAF 

AKAIYERI^CWIVTRINDIIEVKN l r O T TIHGKNTV 

IG VLD1YGFEIFDNNSFEQFCINY CNBKLQQLFIQL 

VIJwQEQEEYQREGIPWKinDYFNNQIIVDLVEQQ 

HKGnAIIJDDACMNVGKVTDEMFLEALNSKLGK 

HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 

VIGFIDKNKDTLFQDFKRLMYNSSNPVLKNN1WP 

EGKLSrTEVTKRPLTAATLFKNSMIALVDNLASK 

EPYYVRCIKPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDU>SDKEAVKKLIERCGFQDDVAYGKTKIFIRT 

PRTLFTLEEUIAQMLIRIVLFIXJKVWRGTLARMR 

YKRTKAALTHRYYRRYKVKSYIHEVARRFHGVK 

TMRDYGKHVKWPSPPKVLRRFEEALQTIFNRWR 

ASQLIKSBPASDLPQVRAKVAAVEMLKGQRADL 

GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 

KDKYMNVLFSCHVRKVNRFSKVEDRA1FVTDRH 

LYKMDPTKQYKVMKTIPLYNLTGLSVSNGKDQL 

VVFHTiaDNKDLWClJ^KQPTHESRIGELwGVL^ 

NHFKSEKRHLQVXNVTNPVQCSLHGKKCTVSVE 

TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276 


HKVAAPDWVFIIJDTVRHEAIJLYTWIAEHKPL 
VLCGPPGSGKTMTLFSALRALPDMEWGLNFSS 




WO 01/57190 



PCT/US01/04098 




ATTPELLIjCTTOHYCEYRRTPNGVVLAPVQLGK 

WLVLFCDEINIJDMDKYGTQRVISFIRQMVEHG 

GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 

rQU^RHWVVYVDYPGPASLTQryGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRIWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 

KARLKWYEEELDVPLVIJFNEVLDHVLRIDRIFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDIJITVLRRSGCKNEKIAFIM 

DESNVLDSGFI^RMNTLLANGEVPGLFEGDEYA 

TLMT(^KEGAQKEGLMLDSHEELYKWFTSQVIR 

NLHVVFTMKPSSEGIJQ>RAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYTVPDYM 

PVVYDKLPQPPSHREAIVNSCVFVHQTLHQANA 

RLAKRGGRTMAJTPRHYLDFI>^AN1JHEKRSE 

LEEQQMHLNVGLRKJKETVDQVEELRRDLRIKS 

QEI^VKNAAANDIQJCKMVKDQQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 

ICLLLGESTTDWKQIRSIIMRElSrFIPTIVNFSAEEIS 

D AIREKMKKNYMSNPS YNYEIVNRAS LACGPMV 

KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 

NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ 

ADCADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETrTCNQMSTIAGEKXLSAAFIAYAGYFDQQMR 

QNLFTTWSHHLQQANIQFRTDLARTEYLSNADER 

LRWQASSl^ADDLCTENAIMLKRFNRYPLIIDPS 

GQATCFIMNEYKDRKITRTSF1JDDAFRKNLESAL 

rixnplv/qdve?ydpvl: ~ * i ev ?rtc-ct ~ 

LITLGDQDK);>; ;> WLSIIIDPIVEFPPDLC ; iv 

TFVNFIVTRSSlXJSQCLNEVLKAERPDVDEKK'rj 

IXKLQGEFQLRLRQLEKSLLQALNEVKGRILDDU 

TIiraENLKREAAEVTRKVEETDIVMQEVETVS 

QQYIJPLSTACSSIYFIMESLKQIHFLYQYSLQFFL 

DIYHNVLYENPNLKGVTDHTQRLSIITECDLFQVA 

FNRVARGMLHQDHTTFAMLLARIKLKGTVGEPT 

YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 

WRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ 

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 

NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 

GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 

MJ,RAGRIFVraPPPGVKANMLRTFSSIPVSRICK 

SPNERARLYFLLAWFHAHQERLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKILMAQSIYGGRVDNEFIXJRLLNTFLERL 

FTTRSFDSEFKI^CKVIXJHKDIQMPDGIRREEFV 

QWVELLPDTQTPSWLGLPNNAERVLLTTQGVD 

MISKMLKMQMLEDEDDLAYAETEKKTRTDSTS 

IXjRPVAWMRTLHTTASNWLHLIPQTI^HLKRWE 

NIKDP1JTIFFE\REVXMGAKLLQ\DVRQDLADV\V 

QVCEGKKKQTNYIJITLI\NELV\KGILP\RSWSHY










TVPAGXMTVIQWGVPISARRIVKQLQNISLVAAASG 

GAKELKNMVCIXKjLFVPEAYITATRQYVAQAN 

SWSLEELCLEVNVTTSQGATLDACSFGVTGLKL 

QGATCNNNKl^LSNAISTALrl,TQIJlWVKQTOT 

EKKASVVTLPVYLNFTRADLIFIVDFEIATKEDPR 

SFYERGVAVLCTE 


SEQ ID NO: 3771; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 2043


LPLLHAGFNRRFMENSSIIACYNELIQIEHGEVRS 

QFKLRACNSVFTALDHCHEAIErrSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIG(^GKIRHFVSIJCKLCCTTDNNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA 

PSLQNRRYPSMARIHSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEILRITELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRM^GNEYVFTKNVHQSHSH 

lAMPITINDWPCISQIXDNEESWDFNIFELEAI'm 

KRPLVYIXjIJCVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHSSNAYHNSTHAADVLHATAFELGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFL\C 

NAGSElAVLYhTOT\AV\LESHHTALAFQ\LTVKDT 

K\CNIFKNID/RG>niYRTLRQ 

FEHVNKFVNSINKPMAAEIEGSIX^CNPAGKNFP 

ENQILIKRMMDCCADVANPCRPLDLCIEWAGRIS 

EEYFAQTDEEKRQGLPWMPVFDRNTCSIPKSQI 

SHDYFITDMFDAWDAFAHLPALMQHLADNYKH 

WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 

CESQ 


SEQ ID NO: 3772; Method: A; Predicted beginning nucleotide location: 1013; Predicted end nucleotide location: 50


TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 

HELIKEAEnQGIMALLTRTLEEASEQIRMNRSAK 

YNIJEXDUODKFVALTO^ 

AVRIiii>N£VSTFPWUDFS^TN\TK\DKOT?NNSL ; 

:4LKALVD\KiI£QTA]mJ^Q^^^ 

KI)TKDARDQLADHLAKAVIV1EEIASQEKNITALEK 

AILDQEGP AKV AHTRLETRTHRPNVELCRDVAQ 

HILMKEVQEITHNVARIJCEIIAXQAQAELKGLH 

RRQLAIXJEEIQVKENTTYIDEVLCMQMRKSIPLR 

DGEDHGVWAGGLRPDAVC 


SEQ ID NO: 3773; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 955


AAARESERQ1JUJILCVLNEILGTERDYVGTLRFL 

QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 

LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 

DKFC^YEEYCSNHEKALRIXVEOOQPTVRAFLL 

SCMLLGGRKTTDIPLEGYL\LSPIQRICKYPLLLKE 

LAKRTPGKHPDHPAVQ\SALQAMKTVCSNINETK 

RQMEKLEALEAAA/QSHEEGWEGSNLTDICTQLL 

LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 

TGSKKSTKRIXSINGSLYIFRGRimEVMEVENVE 

DGTGSPSPSLA 


SEQ ID NO: 3774; Method: A; Predicted beginning nucleotide location: 4254; Predicted end nucleotide location: 2061


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLI 

RVDGKGSIKEIJOTGKQLEPLVAPLADGKVAVG 

QDDLTWLNEEGICTQKCALNWTDIPVAMEHQP 

PYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFE 

lALQIAEMKDDSDSEKQC^IHHIKNLYAFNLFC 

QKRFDESMQWAKLGTDPTHVMGLYPD1XPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIDYLTQKRS










QLVKKOTOSDHQSSTSPIJvlEGTPllKSKKKIXQn 

DTTLIJCCYIJiTNV AL VAPIXR1JEKNHCHIEESEH 

VLKKAHKYSELELYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAffYl^HIIHVWEETGSRFHNCLIQLYCEKV 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQAIJFIYVrflLKOTRMAEEYCHKHTO 

KDGNKDVYLSLLRMYLSPPSfflCLGPIKLELLEPK 

ANLQAAIXJVLELHHSKLDTTTCALNIXPANTQIN 

DIRIFLEKVLEENAQKKRFNQVLKNLLHAEFLRVX 

QEERIUlQQVKCnTEEKVCMVCKKKIGNSAFAR 

YPNGWVHYFCSXKEVNPADT 


SEQ ID NO: 3775; Method: A; Predicted beginning nucleotide location: 1832; Predicted end nucleotide location: 839


MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLIJITWXISRARQQTFIFTIXjDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

WLGRPSLDHPIEATERVQGGRrVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGY1VEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNWNVAGGFSLHQ 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 


SEQ ID NO: 3776; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 796


PRAKLGTRAKNMAGQDAGCGRGGDDYSEDEGD 
SSVSRAAVEVFGKLKDLNCPFLEGLYITEPKTIQE 
LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 
PTEVKIQEMTKLGHELMLCAPDDQELLKGCACA 
QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 
REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 
PLL^O^DDWQWASASA-'^riSP^KI *EL A .~ 
ESAAKLHAL'J BiTAQHEv^GAAAGAANTS. r' 


SEQ ID NO: 3777; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 413


SEEDVIEGKTAVIEKRRKKRSSAGWED/IGG£\ Q] 
NMLEGVGVDINKA11-AKRKRLEMYTKASLR7 i>2 T ! 
QKIEHVWKTQQDQRQKLNQEYSQQFLTLFQQW 
DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 
LLL 


SEQ ID NO: 3778; Method: A; Predicted beginning nucleotide location: 132; Predicted end nucleotide location: 788


SRLPPPPPHLADGRAGARVPRSARLSRWWVQD 

WTHGPIVRPPAAARTMWVNPEEVLLANALWITE 

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

VVLDSSARVAPYRILYQTPDSLVYWTIACG\GSR 

KEriEHWEWLEQNIXQTLSIFENENDrriPVRGKI 

QGEAEYNKI^VKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 


SEQ ID NO: 3779; Method: A; Predicted beginning nucleotide location: 2; Predicted end nucleotide location: 934


CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEmiQEVFE(^GDITAIRKSKKNFCHIRFAEEF 

MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAAIJLAEKLKDDSKFSEAM\Q 

VLLSWIERGEVNRRVSANQFYSMVQSANSHVRRL 

MNEKATrlEQEMEEA^ 

AVFNASTRQKAWDH1^KAQRK^^D1WAK^HSEE 

LRNAQSEQl^GIRREEEMEMSDDENCDSPTKKM 

RVDESALGAP 


SEQ ID NO: 3780; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 2535


AAQAEREELAAGRMPGGGPQGAPAAAGGGGVS










HRAQSKDCLPPAACFRRRRLARRPGYMRSSTGP 

GIGFI^PAVGTUTIFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKE1EALQARMFVLEAKDQQLRRE 

IEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLK 

ETITKVCMSEKFCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSRNVKKLGSVKEDYNRLRREVEHQETA 

YETSVKEKIMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK 

CEDIGKK1XYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 


SEQ ID NO: 3781; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 995


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAJCRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

YPQiic ^^UvGAAQMQPxvlHRYDVSALQYNSM 

TSSQTYRiNG/SRPTYSMSYSQQGTPGMAPGSVMG 

SWKSEASSSPPWTSSSHSRAPCQAGDLRDMIS 

MYIJ^AEWRPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


SEQ ID NO: 3782; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 2649


FRVPDSCPWLHSFTQLDPDLPRPESSTQEIGEEU 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

NLYKTCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTIFIXTYRAFTTTQQVIJ^LLFKRYGRCDALTA 

SSRYGCILPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHL1XAQLEHSEPIEAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPWAEN 

GI^EEKPHLLVFPPDLVAEQFTLMDAELFKKVVP 

YHCLGSIWSQRDKKGKEHLAP1TRATVTQFNSV 

ANCVITTXXGNRSTKAPDRARVYEHWIEVAREC 

RILKNFSSLYAILSALQSNSIHRLKKTWEDVSRDS 

FRIFQKLSEEFSDENNYSLSRELLIKEGTSKFATLE 

MOTKJUQKRPKETGnQGTVPYLGTFLTDLVML 

DTAMKDYLYGRLD^KRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS 













TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAG 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKIPEN ANVFY AMN STAN YDFVLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 


SEQ ID NO: 3783; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 869


RSGQGKVYGUGRRRFQQMDVT^GLNLUTISGK 

Rhn^RVYYI^WIJ(NKILHNDPEVEKKQ 

GDMEGCGHYRVVKYERIKFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVfflQSQIT 

PHAHtIJOTIXjMEMIXC^DEGVYVN^ 

DWI^WGEMPTSVAYICSNQIMGWGEKAIEIRS 

VETGHLIX3VFMHKRAQRLKFLCERM)KVFFASV 

RSGGSSQVYFMTLNRNCEMNW 


SEQ ID NO: 3784; Method: A; Predicted beginning nucleotide location: 12J3; Predicted end nucleotide location: 457


LSPRQVDGLAGLQKGLSLSLLYQFLMNG1RLGTY 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVIVGSS 

TQIXTFSSTKDLLSQWEIFPPQSWKLALVAAMM 

SGIAWLAMAPFDVACTRLYNQPHRCTGQGPVLY 

RGILDALLQTARTEGIFGMYKGIGASYFRLGPHTI 

LSLFFWDQLRSLYYTDTK 


SEQ ID NO: 3785; Method: A; Predicted beginning nucleotide location: 193; Predicted end nucleotide location: 813


rrrgrhslcggkmlaycvqdatwdvekrrnp 

skhywn>tvtwsdstsqtiyrry\skffdlqmql 

ij)\kfpi\esgqkdpkqriipflpgkilfrrshirdv 

avkrlkpideycralvrlpphisqcdevfrffear 

pedvnppkeqgpsppdavlpygvnkgkqelkag 

pnwpgrthhvvncvtqkclfvfh^ 

sk; 1 ^ - 


SEQ ID NO: 3786; Method: A; Predicted beginning nucleotide location: 3785; Predicted end nucleotide location: 1632


EFV GRAASTTVVTRl; WRm>i£)AGIRRWPSDLY 

PLVLGFLRDNQLSEVA? JCFAKATGATQQDANAS 

SLLDIYSFWLNRSAKWEiUaOANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAICKAAVPAKRVGL 

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQKPK1TP\VTVKAQTKAPPKPARA\APKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTFIT^SSSSEDSSSDEEEEQKKPM 

KNKPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVVVSKSGSLKKR 

KQNEAAKEAETPQAKKIKUJTPWW 

RASSPFRRVREEEIEVDSRVADNSFDAKRGAAGD 

WGERANQVUOnTKGKSFRHEKTKKKRGSYRGG 

SISVQVNSIKFDSE 


SEQ ID NO: 3787; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 5078


IPEG/RAI^AEHTSSLVPSLHITTLGQEQAE,SGAV 
PASPSTGTADFTSILTF1.QPTENHASPSPWEMPTL 







PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KKDSVTAILGKNEEANVTIPIX^ 

VNGFVSDFSTGSVSSPIITAPRTNPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESnSGLQQQTNYDLNGHTISTTS 

WETHLAPTAPPNGLTSAADAJKSQDFKDTAGHS 

VTAEGFSIQDLVLGTSffiQPVQQSDMTMVGSHID 

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 

PAKSSSMTTLAKWTNKAASGPKRTPGAVHTAF 

PFIITYMYARTGHTTSTHTA/IARKHGHCLWPVV 

YNU>/PP/GKPQAMrnX3LPNPTNI^MPRASTPRPL 

TVTAALTSITASVKATRLPPIJEIAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHmOVGYYATKGKLVYLPAVVIEMLGVY 

GVSNVTADLKQrTIPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVLNTK5NLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNIX)ISETTRDYWVnVLQGVDNSLV 

GLHNQSFARVMEQRLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTIDSQRMALTLHHWLLQAD 

PWKNPPNM.WIIAAV1J^IAVVTV^ 

KKKNDFKPDTMINLrKJRAKPVCXjFDYAKQHLG 

QQGADEF\TPVTQ!:TVVXPL; " D4pOE v nVA\^ 

GSmTAICSTCITv^ ^;^SPSEhfGSVISNESGKPSC^ < i 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV ; 

LrTDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSAVLNGEVNKAIXQKSDIEHYRNKL 

ROCAKRKGYYDFPAVETSKGLTERKKMYEKAPJ 

KEMEHVLJDPDSELCAPFTESKNRQQMKNSVYRS ' 

RQSLNSPSPGETEMDLLVTRERPRRGIRNSGYDT 

EPEIIEETNDDRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEWTSAPGTMTOPRAGVQWVP 

TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP 

PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAQ 

IJHDSASFTQMSRGPVSVTQLIXJSALNYSGNTVP 

AVFAIPAANRPGFTGYFIPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYDSAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLTOISTAALVKAIREEVAKLAKKQTDMFEF 

QV 


SEQ ID NO: 3788; Method: A; Predicted beginning nucleotide location: 2; Predicted end nucleotide location: 1737


MKGLYTDAEMKSDNVKDKDAKISFLQKAIDW 

VMVSGEPLLAKPARIVAGHEPERTNELLQIIGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNKNVREEESRVHKNTEDRGDAEDCERSTSRD 

RKQKEELKEDRMPREKDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK










KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNIRRIPRPGSARPAPPRVKRQDSMEAL 

QMDRSG SGKTVSNVITESHNSDNEEDDQFWEA 

APQLSEMSEIEMVTAVEIJEEEEKHGGLVKKILET 

KIODYEKXC^SPKPGEKERSIJESAWKKEKDIVS 

KEIEKLRTSIQTIXKSAI^LGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAE1A\E1XQLIKD\Q\QDKICAVKANILKNEEKIQ 

KMVYSINLTSRR 


SEQ ID NO: 3789; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 4369


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFNWEQVNTLTKPTSDPWMPSGS 

FMLVNASGRPEGQRAHLIXPQLKENDTHCIDFH 

YFVSSKSNSPPGIXNVYVKVNNGPLGOTIWNISG 

DPTRTWNRAELAIS1TWPNFYQVIFEVITSGHQG 

YLAIDEVKVLGHPCTRTPHFLRJQNVEVNAGQFA 

TF(^SAIGRTVAGDRLWLQGIDVRDAPLJCEIKVT 

SSRRFIASFNVVNTTKRDAGKYRCMI\RTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTBYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEVVEVKSRQITIRWEPFGY 

NVTRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITh^SPYTNVSVKIJLMOTEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK 

LGNETHFLFFGLYPGTTYSFC1RASTAKGFGPPAT 

NOmXISAPSMPAYELETPLNQTDNTVTVMLKP 

An;a&\r's\^ r Qn/vr™ 

VP^IFQNASLLNSQ v VVAA^'PADSLQAAQPFIIG 

DNKTYNGYWNTPLLP\ iCS^J'.FQAASRANGET 

KIDCVQVATKGAATPKP v r&EKQTDHTVKIAG 

VIAGIIXFVIIFLGVVLVMKKRKLVAKKRKETM 

TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD 

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 

1KIMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAIRVADLLQHTTQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDENRMKNRYGNHAYDHSRVRLQT 

IEGDTNSDYINGNYIIXjYHRPNHYIATQGPMQET 

IYDFWRMVWHEOTASIIMVTNLVEVGRVKCCK 

YWPDDTEIYKDIKVTLffiTEIlJVE^ 

GVHEIREIRQFHFTGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLWHCSAGAGRTGCFIVIDIML 

DMAEREGVVDIYNCVRELRSRRVNMVQTEEQY 

WIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 

PQTNSSQIKEEFRTLNMVTPTLRVEDCSIALLPRN 

HEKNRCMDELPPDRCLPFLITEDGESSNYINAALM 

DSYKQPSAFIVTQHPLPNTVKDFWRLVLDYHCTS 

VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 

VSADLEEDESRIFRIYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLXLIRQVDKWQEEYNGG 

EGRTVVHCLNGGGRSGTFCAISIVCEMLRHQRTV 

DWHAVKTIJINNKPNMVDLLIXJYKFCYEVAI^










YLNSG 


SEQ ID NO: 3790; Method: A; Predicted beginning nucleotide location: 261; Predicted end nucleotide location: 485


EEQTPU1IASRLGKTEIVQL1XQHMAHPDAATTN 
GYlPIJnSAREGQV\DV\ASVLLGRQGAAHSFRLT 
KVRRMTS 


SEQ ID NO: 3791; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 5874


LPPVTMSGKYIMEEHDSYSDQVWSE>ELPSKQG 

YYL(^hfYLRCVAEVGSFEHNLTTOLLNHLVFVQ 

KVFMKEVNEVIQKVSGQEQPIPLWNEHDGTADG . 

DKPKILLYSLinXJFKGIQVTATTPSMRAVRFETG 

LIELELSWRLQTXASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQ V YEE AG SDFHQV A YFKTRIGLRN A 

LREEISGSSDREAVLITLNRPIVYAQPVAFDRAVL 

FWLNYKVAAYDNW^QRMALHKDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTUTACSSESLVSK 

GHFKNFCIRFADGFETSWDDWKPEIHGDLVMNA 

CVVPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WKMCGIDVHMDPMGKRLNALGNTLTTLTGEED 

IDDIADLNSVhnADLSDEDEVDTMSPTlHTEATDY 

RRQAASASQPGELRGRKEMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVOTNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGIPFQTEEGRRDDSLSSTS 

EDSEKDEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTTPVNRSLSGTATERNIDFELD 

IRVEIDSGKCVLHPTTLLQEHDDISLRRSYDRSSR 

SLIXJDSPSKJOCKFQT^ASTimMTGKKVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPJ PST 

VQ^KTI^TXPPQPPPIPA/IIGKGSCGVKTAK _ 1 A 

WVALQSLPEEMVISPCLLDFLE. ^ITPITPVER 

NYTAVSSQDEDMGHFEEPDPMEES\TTSLVS\SSTS 

AYSSFPVDVVVYVRVQPSQIKFSCLPVSRVECML 

KLPSIJ)LVFSSNRGELETLGTTYPAE'ELSPGGNA 

TQSGTKTSASKTGIPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVKVSI^RIRRSGGASrTvESQSVSKSASKMDTTLI 

NISAVCDIGSASFKYDMRRLSEILAFPRAWYRRSI 

ARRLF1X}IX?TIN1J?TCGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPWFNEHMTN 

STMSPGTVGQSLKSPASERSRSVSDSSVPRRDSLS 

KTSTPFNKSNKAASQC^TPWETLVWAINLKQL 

NVQMNMSNVMGOTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGWGGTIDVNALEM 

VAfflSEHPNQQPSHKIQITMGSTEARVDYMGSSEL 

MGEFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDUCWDIFQVMISRSTTPDLIKIGMKLQEFFT 

QQFDTSKRAl^TWGFWYIJPKTMTSNLEKSSQE 

QLLDAAHHRHWPGVLKVVSGCHISLFQIPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPhnAFWTEAQKIWEIXjSSDHSTYWQTLDF 

HLGHNTMVTKPCGALESPMATITKITRRRHENPP 

HGVASVKEWFKYVTATRNEELNLLRNVDANNT










ENSTTVKNSS1XSGFRGGSSYNHETETIFALPRM 

QLDFKSIHVQEPQEPSLQDASLKPKVECSVVTEF 

TDfflCVTMDAELIMFLHDLVSAYLKEKEKAIFPP 

RILSTRPGQKSPniHDDNSSDKDREDSITYrrVDW 

RDFMCOTWHIJEPILRLISWTGRKmPVGVDYII^ 

KIXJFHHARTTIPKWLQRGVMDPLDKVI^VIJKK 

LGTALQDEKEKKGKDKEEH 


SEQ ID NO: 3792; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 364


QNGSTPLHHAASKNRHEIALMLLEGGANPDGKD 
HYEATAKHQATAKGNFKMIHILLYYKASTnQDT 
EGNTPPHLVCD\RVEEAKLLVSQGA/SIYIENKEE 
KDP/LQVAKGALGLVLKRMVEG 


SEQ ID NO: 3793; Method: A; Predicted beginning nucleotide location: 2; Predicted end nucleotide location: 340


DIVPNPKMAPLGDEAPTLEKVLTPELSEEEVSTR 
DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 
PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 
KSGPASRPAL 


SEQ ID NO: 3794; Method: A; Predicted beginning nucleotide location: 421; Predicted end nucleotide location: 158


SYWVGEDYTYKFFEVILIDPFHKAIRRJs^DTQWI 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 


SEQ ID NO: 3795; Method: A; Predicted beginning nucleotide location: 24; Predicted end nucleotide location: 592


GGMDSRVSGTTSNGETKPVYPVMBKKEEDGTLE 

RGHWNNKMEFVLSVAGEnGLGNVWRFPYLCYK 

NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG 

GVTAWRKICPIFEG1GYASQMIVILU4VYYIIVLA 

WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 

TNGSLNGTSENATSPVIEFW 


SEQ ID NO: 3796; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 592


KPASTYSTSQPSMAPLLPIRTLPL1LILLALLSPGA 

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFVVPPCRGRRELVSVVDS 

GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 

ATESSREIPMFTLPRRNMESIGLGMARTGGMVVI 

TVLLSVAMFI 7 VLGFIIALALGSRK 


3797




i 


1556 


AT^LIRGSGS .^C3 .^GPPA • : GGAYPN 

IFI -JPLPGWKJ>WAT^ 

RVASQNKFGQFCTVGILINSGSRYf, AKYLSGIAH 

FI^KIAFSSTARFDSKDEIIXTLEKHGG^^CQTS 

RDTTMYAVSADSKGLDTWALLADWLQ?RLT 

DEEVEMTRMAVQFEIXDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCTTEWAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GPTPIPELTHIMVGLESCSFLEEDFIPFAVLNMMM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN 

ATSYHHSYEDTGLLCIHASADPRQVREMVEIITK 

EFILMGGTVDTVELERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 

KRVASKMLRGKPAVAALGDLTDLPTYEHIQTAL 

SSKDGRLPRTYRLFR 


SEQ ID NO: 3798; Method: A; Predicted beginning nucleotide location: 73; Predicted end nucleotide location: 759


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 

QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 

I^WFLNDRPNIKCTKGGIjVAYSTSVNLTSDGQV 

LASRFMAYHKFLKNSQDYTEALRAARELAANTT 

ADLRKVPGTDPAFEVFPYTITNVFV^YLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


SEQ ID NO: 3799; Method: A; Predicted beginning nucleotide location: 73; Predicted end nucleotide location: 759


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY










LPWFLNDRPN1KCPKGGLAAYSTSVNLTSDGQV 

I^SRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADUUCWGTDPAFEVFPYTTI^^ 

LfMl^lXXVPTFAV 

MILVDTVGFMALWGISYNAVSLINLVS 


SEQ ID NO: 3800; Method: A; Predicted beginning nucleotide location: 250; Predicted end nucleotide location: 1032


GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 

TMGFGDLKSPAGLQVLNDYLADKSYIEGYVPSQ 

ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKE 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 

DDDEDLFGSDDEEESEEAKRLREERLAQYESKKA 

KKPALV AKS SILUD VKP WDDETDMAKLEEC VRS 

IQAIX5LVWGSSKLWVGYGIKKLQIQCVVEDDK 

VGTDMLEEQITAFEDYVQSMDVAAFMCI 


SEQ ID NO: 3801; Method: A; Predicted beginning nucleotide location: 155; Predicted end nucleotide location: 656


SREMELVTFRDVAIEFSPEEWKCLDPAQQNLYR 

DVML^NYR^VSLGFVISNPDLVTCLEQIKEPCN 

OCIHETAAKPPMCSPFSQDl^PVQGM^SFHKLIL 

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCl^TTQSKIFQCOTCVRWSTSSHSNKHK 


SEQ ID NO: 3802; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 1428


vtvspethmdltkgcvtfediaiyfsqdewglld 

eaqrllylevmlenfalvaslgcghgtedeetp 

sdqnvsvgvsqskagsstqktqscemcvpvlkd 

ilhiadlpgqkpylvgecmhhqhqkhhsakks 

lkromdrasyvko:lfcmslkpfrkwevgkdl 

pamlrllrslvfpggkkpgtitecgedirsqksh 

yksgecgkasrhkhtpvyhprvytgkklyecsk 

cgkafrgkyslvqhqrvhtgerpwecnecgkf 

fsqtshlndhrrihtgerpyecsecgklfrqnss 

lvdhqkihtgarpyecsqcgksfsqkatlvkhq 

rvhtgerpykcgecgnsfsqsailnqhrrihtga 

kpyecgqcgksfsqkatlikhqrvhtgerpykc 

GDCuF^^SQSSILIQHn?^7rGA?J?yECGQC J 'S? 
SQKSGLIQHQWHTGERPYECK • CuNSFSQCSSL 
IHHQKCHNT 


SEQ ID NO: 3803; Method: A; Predicted beginning nucleotide location: 193; Predicted end nucleotide location: 617


LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 

QHLKR1JOISGLGH1JCWTKAEDIDIBTPGSILVNT 

NLRALINKHTFASLPQHFQQYLLLLLPEVDRQMG 

SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 

VFSIIM 


SEQ ID NO: 3804; Method: A; Predicted beginning nucleotide location: 197; Predicted end nucleotide location: 479


SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 
TGSSSSPNSSWVSSPLQPEGLSGSSRMKGGSATKI 
LLETLLLAAHMTADQGIASSQRCLL 


SEQ ID NO: 3805; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 385


QSADTLFPGDINFNVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KERTESEVPPRPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 


SEQ ID NO: 3806; Method: A; Predicted beginning nucleotide location: 47; Predicted end nucleotide location: 1033


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSIPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SSN* 

KJUANGLXjFSFVQMEKESCSHLKSDLVRIKRLFP 

GHPAEENGAIAAGDIILGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCIDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP


SEQ ID NO: 3807; Method: A; Predicted beginning nucleotide location: 656; Predicted end nucleotide location: 1238


RCPSLLPPSWPIJPTLQ1LTR1TGNKAIAGGAGLW 

AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 

QDKFLVLASDGLWDMLSNEDVVRLVVGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATRLIRHAIGNNEYGEMEAERLAAMLTLP 

EDI^RMYRDDITVTVVYFNSESIGAYYKGG 


SEQ ID NO: 3808; Method: A; Predicted beginning nucleotide location: 26; Predicted end nucleotide location: 2195


SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 

ALLPEFPRGPLDAYRARASFSWBCELALFTEGEG 

MLRFKKTTFSALENDPLFARSPGADLSLEKYREL 

NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

K1FRMEIFGCFALTELSHGSNTKAIRTTAHYDPAT 

EEFIIHSPDFEAAKFWVGNMGKTATHAVVFAKL 

CVPGIX^CHGLHPFIVQIRDPKTLLPMPGVMVGDI 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTTVSPFKDVRQRFGASLGSLSSGRVSIVSL 

AILNLKLAVAIALRFSATRRQFGPTEEEEEPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHALASASKPLASWTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CIYEGDNNILLQQTSNYIXGLIAHQVHDGACFR 

SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 

LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKDDAVALVDVIAP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 


SEQ ID NO: 3809; Method: A; Predicted beginning nucleotide location: 117; Predicted end nucleotide location: 830


CFGIMERVGCTLTTTYAHPRPTPTNFLPAISTMAS 
S YRDRFPHSNI THSLSLP WRPSTYYKVASNSP5! V 

apyctrsqrv; r.mhff.VF'- T ^T^\'^:rrPD£^v ; 

YRJ t~L ; rNYQE^NTSRHNSEKLR v OiSiU-i >DKYQ 
QTRKTQADTTQNLGERVNDIGF^^ fvEDHELDEM 
IGE1KALTDVKKRIJERALMETEAPLQV, *RECLF 
HREKRMGIDLVHDEVEAQLLTVNVGEMIIQSQA 
A 


SEQ ID NO: 3810; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 518


VIQELEGGSGADLGEHSCRPASQPRFPRPAEARS 
HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 
FSEQELKQWYKGFOCDCPSGILNLEEFQQLYIKF 
FPYGDASKFAQHAFRTFDKNfGDGTlDFREFICAL 
SVTSRGSFEQKLNWAFEMYDLIXjIXjRITRLEML 

eiie 


SEQ ID NO: 3811; Method: A; Predicted beginning nucleotide location: 81; Predicted end nucleotide location: 1147


gcgygcsgaggaaigepmakwgegdprwivee 
radatnvnnwhwterdasnwstdklktlflav 

QVQNEEGKCEVTEVSKLIXjEASINNRKGKLIFFY 

EWSVKLNWTGTSKSGVQYKGHVEIPNLSDENSV 

DEVEISVSLAKDEPDTOLVALMKEEGVKLLREA 

MGIYISTLKTEFTQGMILPTMNGESVDPVGQPAL 

KTEERKAKPAPSKTQARPVGVKIPTCKITLKETFL 

TSPEELYRVFTTQELVQAFTHAPATLEADRGGKF 

HMVDG WSGEFTDL WEKHIVMK WRFKS WPEG 

HFATTTLTFIDKNGETELCMEGRGIPAPEEERTRQ 

GWQRYYFEGIKQTFGYGARLF 


SEQ ID NO: 3812; Method: A; Predicted beginning nucleotide location: 20; Predicted end nucleotide location: 558


PCGTAASTHAYDREAKCRQQQQQQQNGGQNKV 
RPAKKKTSPAREVSSESGTSGQFTPPSSTSVPTIAS










SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPG ATLSPMGTNA VTSHLNQSPASLSTQG YGAS 
KLWGFNFNH 


SEQ ID NO: 3813; Method: A; Predicted beginning nucleotide location: 1; Predicted end nucleotide location: 1016


CTEPPRRSTRTPAAIASIJU'YTDYVVVSDQILQES 

EDFFTLIESHEGKPIjajvlVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHRIPTQPPSYHKKPF 

GTPPPSALPLGAPPPDALPPGPTPEDSPSLETGSRQ 

SDYMEAIXQAPGSSMEDPLPGPGSPSHSAPDPDG 

1JHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 


SEQ ID NO: 3814; Method: A; Predicted beginning nucleotide location: 2; Predicted end nucleotide location: 884


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

KVPKYLSQQWAKASGRGEVGKLRIAKTQGRTE 

VSFTLNEDLANDIDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIVVQRAECRPAASE 

NYMRLKRIXJIEESSKPVRLSC^LDKVVTTNYKP 

VANHQYNIEYERKKKEIKjKRARADKQHVLDML 

FSAFEKHQYYNLKDLVDITKQPVVYLKEILKE1G 

VQNVKGIHKNTWELKPEYRHYQGEEKSD 


SEQ ID NO: 3815; Method: A; Predicted beginning nucleotide location: 17; Predicted end nucleotide location: 411


nigdwedigksperjiqyygpatwaqdgsrgyct 
piymlnherlqavleiim^ranaldllaqqttk 
mrnany qnrlaldyllaheggv*gkfsltncc 
leiddngkaimeitarmrklahipvqtwer 


SEQ ID NO: 3816; Method: A; Predicted beginning nucleotide location: 3; Predicted end nucleotide location: 1172


SHWQRRDRRCVRNMAERGRKRPCGPGEHGQRI 
EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 
EEQA KRLEEEEAAAEKEDRGRP YTLSV ALPGSIL 

OQDAKTVEGEFTGVGKKGQACVQLA . LQYLEC 

PQYLRKAFFPKHQDLQFAGLLNPLDSPHHMRQD 

73ESEFREGVWDRPTRPGHGSFVNCGMKKEVKI 

Dia^LEPGLRVTVRI^QQQHPDCXTYHGKVVSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLUGTSERGSDVASAQLFNFRHALWFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRTEEAILISLAALQPGLIQAGARHT 


SEQ ID NO: 3817; Method: A; Predicted beginning nucleotide location: 246; Predicted end nucleotide location: 1197


FI^AGMSNFTHYAYIJLMIESLMLGKVPPHVPSH 

HFlFHDIXjSARQKGESDYKVnQQWFSKSGPWTT 

SSNVTWGLLELQQSISESAVLTIPPGDSGAGSNLI 

TMFIJWRKETD1XSGRSKVNRGWNSGRCKQRG 

KTEQPGEPLEHVYVTEKHAVALESRHQKGELQC 

LnCMCIPI^KPIXJMFFSPPHWEAWLQRVQQLAK 

NTRYFRQRLQEMGFHYGNENASWPLLLYMPG 

KVAAFARHMLEKKIGWWGFPATPLAEARARF 

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH 

KKSARPELYDETSFELED 


SEQ ID NO: 3818; Method: A; Predicted beginning nucleotide location: 215; Predicted end nucleotide location: 789


NPQSSSSEGSSEIFQVNGHNRLLVQRSBVTQAPG 

QYTVDVEGHGCTHQATLKYNVIJLPKKASGFSLS 

LEIVKNYSSTAFDLTVTLKYTGIRNKSSMVVIDV 

KMLSGFIPTMSSIEEIJ5NKGQVMKTEVKNDHVL 

FYLEKWGRADSFTFSVEQSNLVFNIQPAPGMVY 

DYYEKEE YAL AFYHIN SSSVSE


3819 


A 


1 


1483 


RIPDSIISRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDITKVA 

RRHRMSPFPLTSMDKAFITVIJBMTPVIXjTEIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

AIXjPGDRFnGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAWQVEDTELIRESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GUATVGVTEVEVNKFPWAVMSTGNEIXNPED 

DLLPGKIRDSNRSTLLATIQEHGYPTINLGIVGDN 

PDDLLMALNEGISRADVETSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGIJ > TIPATLDIIDGVR 

KIIFALPGNPVSAVVTCNIJFVWA^ 

RPTIIKARLSCDVKLDPRPEYrlRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEWDVMVIGRL 


3820 



A 


2216 


487 



PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSF/CVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKIXjECEDVDECAMGTHTC 

qpgfix:qntkgsfycqarqrcmdgflqdpegnc 

vdinectslsepcrpgfscintvgsytcqrnplic 

argyhasddgtkcvdvnecetgvhrcgegqvc 

hnlpgsyrcdckagfqrdafgrgcidvnecwas 

pgrlcqhtcentlgsyrcscasgfllaadgkrc 

edvneceaqrcsqecaniygsyqcycrqgyqla 

EEXjHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

cpeqgytmtangrsckdvdecalgthncseaet 
chniqgsfrclrfecppnyvqvsktkcerttchd 
flecqnsparithyqu^otgixwahifrigpap 

aftgdtia t >!tr ^ eegyt g 1r"> t x ? \ytgwyl 

QRAVLEPRCvaw . 'iMKLWRQGSVTTFLAKMHl 
FFTTFAL 


3821 


A 


2216 


487 


pqepalksefsqy/ sntiplplpqpntckdngpck 

qvcstvggsaicscrfgyaimadgvscedqdecl 

mgardx^rrqfcwtlgsfycvnhtvlcadgyi 

lnahrkcvdinecvtdljitcsrgehcvntlgsf 

hcykaltcepgyaijadgecedvdecamgthtc 

qpgflcqntkgsfycqarqrcmdgflqdpegnc 

vdinectsl^eporpgfscintvgsytcqrnplic 

argyhasddgtkcvdvnecetgvhrcgegqvc 

hnijpgsyrcdckagfqrdafgrgcidvnecwas 

PGRLCQrHX^Eim^jSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECA(^AGEXTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGVYYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 


3822 


A 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKAFVRDPAPTKPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP 
SHIERYKKDLKSWV<^NLTACGRSLFLFDEMDK 

MPPGLMEVLRFFLGSSWVWGTNYRKAIFMSN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVUDNPHHGFSNSGIMEERUJDAVWFLPLQRHH 

VRHCVLNEIJVQLGLEPRDEVVQAVLDSTTFFPE 

DEQLFSSNGCKTVASRIAFFL 


3823 



A 


1 


3174 


YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC 

SLPSSRNDSDPQEDFTPEVYRVFLNNLCPRPEIDNI 

FSEFGAKSKPYLTVDQMMDFINLKQRJ)PRLNEIL 

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM 

RYl^GEENGWSPEKLDLNEDMSQPLSHYFINSS 

HNIYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTiyriTEISFKEVIEAIAE 

AFKTSPFPILLSFENHVDSPKQQAKMAEY CRLIFG 

DALLMEPLEKYPLESGVPIJPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKl^EQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLV>T^QPVia^FEISKKJWKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG 

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQINMGMYEYNGKSGYR1JKPEFMRRPDKHFDP 

FTEGIVDGIVANTLSVKIISGQFLSDKKVGTYVEV 

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI 

VFKKVVOni^CIJUAVYEEGGKFIGHRILPVQAI 

RPGYHYICLRNERNQPLTLPAVFVY1EVKDYVPD 

TYADVIEALSNPIRYVNIMEQRAKQLAALTLEDE 

EEVKKEADPGETPSEAPSEARTTPAENGVNHTTT 

LTPKPPSQALHSQPAPGSVKAPAKTEDLIQSVLTE 

VEAQTOELKC^KSFVKLQKKHYKEMKDLVKR 

HHKKTTDLIKEHTTKYKEIONDYIJIRRAA 

AK^DSKK^EPSSPDKGSS * iHODI P AE> <TO 

K1JDLKDK<^Q^I.LNIJIQEQYYSEKYQKRE } 

mQKLTDVAEECQNNQLKKIJCmCEBCEKKELK>; 1 

KMDKKRQEKITEAKSKDKSQMEEEKTEMIRSYI 

QEWQYIKRIJEEAQSKRQEKLVEKHKEIRQQILD 

EKPKLQVELEQEYQDKFKRLPLEELEFVQEAMKG 

KISEDSNHGSAPLSLSSDPGKVNHKTPSSEELGGD 

IPGKEFiyrPL 


3824 


A 


1 


426 


IlJttWFVHRWSGRNNREKIGVHVGFEEILNMEPY 

CCRETLKSLRPECFIYDLSAVVMHHGKGFGSGH 

YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA 

QAYILFYTQRVTCNGHSKLLPPELLLGSQHPNED 

ADTSSNEILS 


3825 


A 


3 


364 


GIRAKFPNKIPVVVERYPRETr^PLDKTXFLVPQ 
ELTMTQrT.SIIRSRMVLRATEAFYLLVNNKSLVS 
MSATMAEIYRDYKDEDGFVYMTYA SQETFGCLE 
SAAPRDGSSLEDRPLHPL 


3826 


A 


1 


1237 


PEKKFERECREAEKAQQSYERLDNDTNATKADV 

EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ 

HKHFYVVIPQIYKQLQEMDERRTIKLSECYRGFA 

D SERKVIPn SKCLEGMIL AAKS VDERRDS QMW 

DSFKSGFEPPGDFPFEDYSQHTYRHSDG'nSASKQ 

ESGKMDAKTTVGKAKGKLWLFGKKPKGPALED 

FSHLPPEQRRKKLQQRIDELNRELQKESDQKDAL 

NKMTOVYEKNPQMGDPGSLQPKLAETMNl^ 
LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDINH 
LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 
FDDEFEDDDP1JPAIGHCKAIYPFIX5HNEGTLAMK 
EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 


3827 


A 


2 


1584 


INP VSS AVNGEAHS SHETRGQNSN ALPS V LLELL 

SQSCLIPAMSSYIJWDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLJLAKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLH.LVPDIQKTAEIVYAATTSLRQA 

NQEKKLGEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQrT)TFEMVSEDEI>3KI.GFKWYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDMKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CI^ILNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMIJBQIRl^PCFKEVIHKHFYLKRVEIM 

AQCEEWIADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EBTLMHEKJVKPSSSKELPSDFQL 


3828 


A 


1415 


845 


PRVPATLVSIJDPWHCFPTAGRLAGSTWVPPACT 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPIJHIFCPLFACFMQEGKVQyLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDKDGNL 

SEFGHMSEFPLDPQLSKSILASCEFDCVDEVLTIA 

AMVTGELNDYSFSFFANLH 


3829 


A 


199 


683 


VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVBEATELKACKPNGKSNPYCEISMGSQSYT 
TRTIQDTLNPKWNFNCQFFTKDLYQDVLCLTLFD 

mdcfspddfi c v 17^?yaki^^eq^2::cpn:trbll 

LHEVPTGEVv. VRjf :>LFEQKTLL 


3830 


A 


1747 


404 


RKMMEESGffiT: ^PGTPPPNPAGLAATAMS STP V 

PLAATSSFSSPNVS3I^SFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY 

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFIKGVAGWNfVlCSVLDKTKHSVESMIT 

TIJ)PGMAPYIKSGGEIJ)IVVTSNKE\aCVAAVRD 

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TDWHMAFTGMSRRQMIYSAARAIAGMYKQRLP 

PRTV 


3831 


A 


5 


674 


FWTRSAWHEGLQQMKANDPSLQEVNLYNIKNIP 

IPTLREFAKALETNTHVKKFSLAATRSNDPVAIAF 

ADMLKVNTTLTSLNIESHFITGTGILALVEA1JCEN 

DTLTEIKTONQRQQIX3TAVEMEIAQMLEENSRIL 

KFGYQFIXQGPRTRVAAAITKNNDLAWQKDTQ 

EQTSrWQWSQSIAGFNPQFEVQGQNARSWMEE 

LGKAFHQFVRRELKQTEGKLP 


3832 


A 


164 


782 


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE 
EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY 


3833 


A 


122 


1676 


SQPPHFTQKMNENKDTOSKKSEEYEDDFEKDLE 

WLINENEKSDASIIEMACEKEENINQDLKENETV 

MEHTKRHSDPDKSLQDEVSPRRNDnSVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRYIM 

EKIVQANKLLQNQEPVNDKRERKLKFKDQLVDL 

EWPLEDTTTSKNYFENERNMFGKLSQLCISNDF 

GQEDVLLSLTNGSCEENKDRTELVERDGKFELLN 

LQDIASQGFLPPINNANSTENDPQQLLPRSSNSSV 

SGTKKEDSTAKIHAVTHSSTGEPLAYIAQPPLNR 

KTCPSSAVNSDRSKGNGKSNHRTQSAHISPVTST 

YCLSPRQKELQKQLEEKREKLKREEERRKIEEEK 

EKKRENDIVFKAWIXJKKREQVLEMRRIQRAKEI 

EDMNSRQENRDPQQAFRLWLKKKHEEQMKERQ 

TEE1JIKQEECLFFLKGTEGRERAFKQWIJUIKRM 

EKMAEQQAVRERTRQLRLEAKRSKQLQHHLYM 

SEAKPFRFTDHYN 


3834 


A 


575 


774 


RSRTEELSNSGILKAMSKDLVTFGDVAVNFSQEE 
WEWLNPAQRNLYRKVMLENYRSLVSLGKDMSP 


3835 


A 


2 


100 


ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 


3836 


A 


91 


749 


RPTPGHGDFWMQPLTKDAGMSLSSVTLASALQV 

RGEALSEEEIWSLLFLAAEQLLEDLRNDSSDYW 

CPWSALLSAAGSLSFQGRVSHIEAAPFKAPELLQ 

GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP 

HQPLQLCEPLHS1LLTMCEDQPHRRCTLQSVLEA 

CRVHEKEVSVYPAPAGLHIRRLVOT VLGT1SEVS 

?£PCFSS3SCW3CVAIKI 


3837 


A 


3


1214 


SLGCTNSARGKGQDi /kTLMANGAPFTTDWFS " 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATBRHHRDVVELLIKYGADVHAFSKFDKSAFD 

IALEKNNAEILVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFDTSGEVVN1ASUSSTNTKTTSGDPH 

ASTVQFSNSTTSV1ATLAALAEASVPLSNSHRAT 

ANTEEIIEGNSVDSSIQQVMGSGGQRVITIVTE>GV 

PIXjMQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVIKEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRIJCIJEAIAR<^PNG\nDFTMVEEVAEVDAVV 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 


3838 


A 


1 


1332 


miednkenkdhslergraslifslknevgglika 

ljk:ifqekhvnu.hiesrkskjirnsefeifvdcdin 

reqlndifhllkshtnvlsvnlpdnftlkedgme 

tvpwfpkkisdldhcanrvlmygseldadhpgf 

kdnvyrkrrkyfadiamnykhgdpipkveftee 

eiktwgtvfqelnklypthacreyijcnlpllsky 

cgyrednipqledvsnflkertgfsirpvagylsp 

rdfwglafrvfhctqyvrhssdpfytpepdtch 

ellghvpllaepsfaqfsqeiglaslgaseeavq 

klatcyfftvefglckqdgqlrvfgagllssise 

lkhai^ghakvkpfdpkttckqeclittfqdvyf 

vsesfedakekmreftktikrpfgvkynpytrsi 
QILKDTKSITSAMNELQHDLDVVSDALAKVSRKP 
SI 


3839 


A 


3093 


520 


MVNFIVIXJIRAIMDKKANIKNMSVIAHVDHGKS 

TLTDSLVCKAGIIASARAGETRFTDTRKDEQERCI 

TKSTAISLFYELSENDLNFIKQSKDGAGFLINLID 

SPGHVDFSSEVTAA1JRVTDGALVVVDCVSGVCV 

QTETVOIQAIAERIKPVL^IMNKMDRAIJLELQLE 

PEELYQTFQRIVENVNVIISTYGEGESGPMGNIMI 

DPVLGTVGFGSGLHGWAFTLKQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 

ANGKFSKSATSPEGKKLPRTFCQLILDPIFKVFDA 

IMNFKXEETAKLIEKLDDCLDSEDKDKEGKPLLK 

AVMRRWLPAGDALLQMTTIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGDCSCDPKGPLMMYISKMVP 

TSDKGRFYAFGRWSGLVSTGLKVRIMGPNYTPG 

KKEDLYLKPIQRTELMMGRYVEPIEDVPCGNIVG 

LVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPV 

VRVAVEArO^ADIJPKLVEGUCRIJVKSDPMVQCI 

IEESGEHIIAGAGELHLEICLKDLEEDHACIPIKKS 

DPVVSYRETVSEESNVLCI^KSPNKHNRLYNiKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNILTDriKGVQYL 

NEIKDSWAGFQWATKEGALCEENMRGVRFDV 

HDVTLHADAIHRGGGQIIPTARRCLYASVLTAQP 

RLMEPIYLVE1QCPEQWGGIYGVLNRKRGHVFE 

ESQVAGTPMFWKAYLPVNESFGFTADLRSNTG 

GQAFPQCVFDHWQILPGDPFDNSSRPSQVVAETR 

KRKGLKEGIPALDNFLDKL 


3840 


A 


2 


753 


SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 
SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 
SIXR.-Cir/SNVEAVTSMOGKSSmCGTSYSFE ; 
:iLQANQHI^NlVEROCEVKLSPI>NGK • . :,CDH 
HGEKLLLFCKEDRKVICWIX^ERSQEHRGHHTVL 
TEEVFKECQEKLQAVIJCRIJCKEEEEAEKLEADIR 
BBKTSWKYQVQTERQRIQTEFDQLRSELNNEBQR 
ELQRLEEEEKKT 


3841 


A 


2 


405 


GKAFSCFIYI^QHRRTHMAEKPYECKrcKKAFS 

HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 

LRHERIHTGKKSYECQQCGKAFTRSRFLRGHEKT 

rnXjEKMHECKECGKALSSI^SLHRHKRTHWRDT 

L 


3842


A 


311 


88 


AVUCNMAPMTALGLLDLHILNLILFLSAGEDFTS 
WSEIMMmLVrXTLWIXIEMIYCYRKVSKAEE 
AAQENA 


3843 


A 


3 


1175 


APnWSRTODFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKKITESVAETAQTIKKSVEEGKIDGIID 

KTnGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 

NDEETIQC^IIJVI^ADKRNFLRDPPAGVQFNFDF 

Dy M Y Jr V AL V ML(^ EDbLJ^JsAlRr Vr KLVKEE 

VFWRNYFYRVSLKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPWDCSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 

VLDKKQEETAVLEEDSADWEKELQQELQEYEV 
VTESEKRDENWDKEIEKMLQEEN 


3844 


A 


798 


148 


LPPAQIPEAWLUJWVWVLILVPIJKDRLroPLL^ 

RCKLLPSALQKMALGMFFGFTSVIVAGVLEMER 

LHYIHHNETVSQQIGEVLYNAAPLSIWWQPQYL 

LIGISEIFASIPGLEFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 


3845 


A 


3 


1934 


PEDSAPQYSRLrTNASQHITPSYNYAFNPDKrIWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGEIJDhrmVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

O^IDLAPTILDIAGLDIPADMDGKSIIJQXDT^ 

VNRFHLKKKMRVWRDSr^VERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLiCLHKCKGPMRLGGSRALSN 

LWKYYGQGSEACTCDSGDYKLS1AGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKIJIIDHEIETLQNKIKNLREVRGHIJC^ 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTENETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3846 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 
MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 
I3&?zl IM^VETTIJ.; , ' r,.7VYI ADiiCYin3QFG 
L VKGKSMPYEFl;^ KV tr 1 V RGPNVEAGCLNPHIV 
LNIDLAFTtLDIAGLT lPADMDGKSILKLLDTERP 
VNRFHLKKKMRVWKD^7LVERGKLXHKRDNDK 
VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

qkwqcvedatgklklhkckgpmrlggsralsn 
lwkyygqgseactcdsgdykl5lagrrkklfk 
kkykas yvrsrsirsvaievdgrvyhvglgdaa , 
qprnltkrhwpgapedqddkdggdfsgtgglp 
dysaanpkvthrcyilendtvqcdldlykslq 

AWKDHKLHTOHEffiTLQNKIKNLREVRGHLKKK 

RPEECDCHKISYfTIX5HKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHOT^ 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3847 


A 


1


1257 


MWSAVLTAFHTGTSNTTFVVYE>riYMNrrLPPP 
FQHPDLSPLLRYSFETMAPTGI^SLTVNSTAVPTT 
PAAFKSLNO>LQITI^AnvlIFILFVSFLGNLVVCLM 
WQKAAMRSAINILLASLAFADMLLAVLNMPFA 
LVTELTTRW1FGKFFCRVSAMFFWLFVIEGVA1LL 
ESIDRFLIIVQRQDKLNPYRAKVLIAVSWATSFCV 
AFPLAVGNPDLQIPSRAPQCWGYTTNPGYQAYV 
ILISUSFFIPFLVn.YSFMGILNTUlHNAlJUm 
GICLSQASKLGLMGLQRPFQMSIDMGFKTRAFTT 

ililfawivcwapfityslvatfskhfyyqhnff 
eistwllwix:ylksai^liyywrikkfhdaci^ 
mmpksfkflpqi^ghtbcrrn^savyvcgehrt 

W 


3848 


A 


3 


2827 


SSAVAAIOUlRSWASLVIJVFLGV<XGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVIXVLEI^GLQKNMTRrTU^ 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYKIILTARPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEG 

GEPYRLYNLDWQYELYNPMALYGSVPVIJLArIN 

PHRDLGIFWLNAAETWVDISSKTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVl^VDQGroDHNLPCDVIWLDffiHADGKRYrT 

WDPSRWQPRTMimLASKRRKLVAIVDPHIKVD 

SGYRVHEElJtNLGLYVKTRDGSDYEGWCWPGS 

AGYPDFIWTMRAWWANMFSYDNYEGSAPNLF 

VWMDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLK1SIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDIIRDAL 

GQRYSLLPFWYTLLYQAHREGIPVMRPLWVQYP 

QDVTTFNIDDQYLLGDALLVHPVSDSGAHGVQV 

YLPGQGEVWYDIQSYQKHHGPQTLYLPVTLSSIP 

Vr^ROGTr^RU^VRF.SRH^<KDJPrnFVALS : 

•?QGTAQGELFLDDGHTf-m QT1LQEF1. .^SFSG 

NTLVSSSADPEGHFETPIWIERWnGAGKPAAVV 

LQTKGSPESRLSFQHDPETSVLVLRKPGrNVASD 

WSIHLR 


3849 


A 


1 


1717 


RARNARGCWGVCRSGFSSAVCGAARMEQVAEG 

ARVTAVPVSAADSTEELAEVEEGVGWGEDNDA 

AARGAEAFGDSEEDGEDVFEVEKILDMKTEGGK 

VLYKVRWKGYTSDDDTWEPEIHLEDCKEVLLEF 

RKKIAE>nCAKAVRKDIQRLSLNNDIFEANSDSDQ 

QSETKEDTSPKKKXKKLRQREEKSPDD1JCKKKA 

KAGKLKDKSKPDLESSLESLVFDLRT1GCRISEAK 

EELKESKKPKKDEVKETKELKKVKKGEIRDIJ^ 

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTOVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKKIXRI^ESKKPKJKDEVKETKELKKVKXGEIRD 

LKTKTREDPKENRKTKKEKFVESQVESESSVLND 

SrTPEDDSEGUiSDSREEKQNTXSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK 

AEDTRENRKLENKNAFLEKKTVPKKQKNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS 

AEEDKETKJWESKXPKKDEVKET^ 
IRDLKTKTREDPKENRKTKKEKFVESQVESESSV 
LNDSPFPED/RC^RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 
EELKESKKPK 


3850 


A 


1113 


3975 


paaaaaaaaaaaaaagrgpsftpcfspslaveps 

rrtrlgsdp aq amagnvkks s g agggsg sgg s 

gsggliglmkdafqphhhhhhhlsphppgtvdk 

kmvekcwklmdkvviux:qnpklalknsppyil 

dllpdtyqhlrtilsryegkmetlgeneyfrvf 

menlmkktxqtisijtcegkermyeensqprrnl 

tki^lifshmlaelkgifpsglfqgdtfritkada 

aefwrkafgekttvpwksfrqalhevhpissgle 

amalkstmltcndyisvfefdiftrlfqpwssll 

r>t^siav7hpgymafltydevkarlqkfih^ 

GSYlFRI^CTRLGQWAIGYVTAIXj>m.(^HNKP 

LFQALIDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIWDPFDPRGSGSLLRQGAEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

MYNIQSQAPSITESSTFGEGNLAAAHANTGPEES 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LFVLERDP*PONVTEGSQVPERPPKPFPRRINSER 

KAGSCQQC "?-AA c A i TAV^QI"5" TT iNLN^QC 

v TxQDIQKi LVIAQNNIEMA. 1 , Nii^VsiSSPAH 

VAT 


3851




2 


2781 


GRVGSMDGAMGPRGLLLCMYLVSl 7.JLQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFS3LKVNC 

KVTSRFAHYVVTSQVVNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDDCDICVTAWKQYRKA 

AISGENAGLVRASGRTMEQFHHLTVNPQSKVTF 

QLTYEEVLKRNHMQYEIVIKVKPKQLVHHFEIDV 

DIFEPQGISKIJDAQASFLPKEI^AAQTIKKSFSGKK 

GHVIJFTU^SC^SCPTCSTSLLNGHFKVTYDVS 

RDKICDLLVANNHFAHITAPQNLTNMNKNVVF^ 

H)ISGSMRGQKVKQTKEALLK1LGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SU)EATNIJ^GGIJLRGffiILNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIWAGRIADNKQSSFKADVQA 

HGEGQEFSrra.VDEEEMKKLLRERGHMLENHV 

ERLWAYLT1QELLAKRMKVDREVRANLSSQALR 

MSLDYGF^TPLTSMSIRGMADQDGLKPTIDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RVTGVDTDPHFIIHWQKEDTLCFN1NEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR 
LGIANPATDFQLEVTPQMTLNPGFGGPVFSWRD 
QAVIJEU5IXjVVVTINKKR^ 

uirvw:gss\vhqdflgllmcwdksigmsspgr 
kck:wgq\tthpirflkvs*hpppgsdpqkaqmpt 
mvvrnppgltvtarglqkdyskdpwhgaevsc 
wfi\hnnga*atdcaytdyi\vpdif 


3852 


A 


39 


1735 


TQVAEAGRGEGWAGAETGRPQSAGMNLELLES 

FGQKYPEEADGTLDCISMALTCTrT^WGTLLAV 

GCNDGRIVIW\DF\LTRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTONWSQWDVLSGDCDQRFRF 

PSPIIJCVQYHPRBQNKVLVCPMKSAPVMLTLSD 

SKHWLPVDDDSDLNWASFDRRGEYIYTGNAK 

GKILVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKGSCFLINTADRHRVYDGREILTCGRDGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYTVAGSARQH 

ALYIWEK5IGNLVKIIJIGTOGELLLDVAWHPVRP 

DASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKT1TOTLQGVPNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


3853 



A 


45 


2603 


PLLFTCGREVRARDPEKEGTtWAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLl^EKNVCKIYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYLI 

QHLRIHTGERPYKCMECGK AFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLi; UT^C^THSCVT 

PYECKECGKA KvRDIJlVHQTniAGERPVii^ < j 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKf | 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRIHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAnCGYQLTLHIJITErrGEIPYEC 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA 

FRL(^ELTRHHRIHTCEKPYECKECGKAFIHSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHRIHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHHTVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEKPYQC 

KECGKAFIRSDQLTLHQ\KIILW\NPMHNV^ 

WPLENAL*QRICNLROTLFVTEHVGlPFTSCSQn 

RNYFVC 


3854 


A 


108 


894 


1^SC\VVPGIPWPSVGWLSWLKDLPSCEIHSASLS 
AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDTI 
DNLSTDDIOTSSSISSYANTPASSRKNLDVQTDAE 
KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 
KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 
GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL 






WO 01/57190 PCT/US01/04098



S\GPGAPTPAAPPQLAIIMA\VAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP^LPLSELTV^ 
RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATILPLASSPPPRT 
TP 


3855 


A 


1 


772 


FRGGIX5APGVOCPGNPLPFPLPPLQYPPPSTLSHS 

DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVYDITNQETFARAKTWVKELQRQASP\SIVVGL 

AGNKADLANKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\AIA*EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQS\QQNKSQCCSN 


3856 


A 


2815 


352 



LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGUDWPEATEVSPSRHRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAY1QHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE 

LRK>TVTGILWNLSSSDHLKDRLAKKTPI^QLT\D 

LGV*APLSGAGGPP\LIQQNASEAEIFYNATGFPR 

NLSSASQATRQKMRECHGLVDALVTSENHALDA 

GKCEDK S VEN A VCVLRNLSYRLYDEMPPS ALQR 

J "/IRGRRDLA ~* rPGEVVGCrTFQSrjlLR^I^LA 

AL.VLTFAEVSKDPKGLEWLWSPQIVGLY; L^Q 

RCELNRHTTEAAAGALQNITGG\DPRGPGGLSRL 

A? BQERILNPLLDRVRTADHHQLRSLTGLERNLS 

RNA3!NKDEMSTKVV\SHLI^^ 

VLW\tmAy¥NNLGm,AS?VALARDU,Y7XX}l^ 

LIFIKKKRDSPDSEKSSRAASSLlJ^NLWQYNKLH 

RDFRAKGYRKEDFLGP 


3857 


A 


1034 


204 


VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 

LIKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

(^KKKTKDLGFRAGKESKTE WRK* GLQDMASQ 

MFALPLK*PVTAAFHDSSMPSSLLQIEMEQLFLE 

ARLQ/PDSKSEARRN(^DSMLLRNQQLCSTCQE 

MKMVQPRTMKIPDDPKASFENCMSYRMSLHQP 

KFQTTPEPFHDDIPTENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


3858 


A 


203 


3469 


SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPE 

ITYRLR1©SWAIX3TMEPA1J , MPPVEELDVMFSE 

LVDELDLTDKHREAMFAI^AEKKWQIYCSKKK 

DQEENKGATSWPEFYTOQLNSMAARKSLLALEK 

EEEEERSKTIESLKTAIJ^TKPMRFVTRFIDIJXjLS 

CIL^KTMDYETSESRIHTSLIGCIKALMNNSQG 

RAHVlJUiSESI>A^QSLSTENIKTKVAVLEELGA 

VCLVPGGHKKVIXJAMLHYQCTASERTRFQTLIN 
DLDKSTGRYRDE V SLKTAIMSFINA VLSQG AG VE 

SlJDFRlJiLRYEVFLMLGIHPVMDKLRKHENSTLD 

RHLDFFE^RNEDELEFAKRFELVHIDTKSATQM 

FELTRKRLTHSEAYPHFTvISIlJfflCLQMPYKRSGN 

TVQYWLLLDIUIQQIVIQNDKGQDPDSTPLENFNI 

KNVVRMLVNENEVKQWKEQAEKMRKEHNELQ 

QKLEKKERECDAKTQEKEEMMQTimMKEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSnKJPTNALKSFNWSKLPENKLEGTVWTEEDD 

TKVFKILDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSVIDGRRAQNCNILLS 

RLKl^^EIKRAILTMDEQEDLPKDMLEQUJCFV 

PEKSDIDLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRUJSLYFKKKFAERVAEVKPKVEAIRSGSEE 

VFRSGAIJCQL1JEVV1AFGNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLITIVENKYPSV 

LNI^JEELRDIPQAAKVNMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSWSQFITVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGK1QPDEFF 

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

IJRSGEWDKDI^KLKRNRKRITNQMTDSSRERPI 

TKLNF 


3859 


A 


1279 


141 


RVEHLSEFLVDDCPSLTFDVPLLDPYGPAGSDPS 
IJEFLWSEETYRGGMAINRFRLENDLEELALYQI 
QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 
ERPELPTCLYVIGLTGISGSGKSSIAQRLKGLXjAF 
VIDSDHIXHRAYAPGGPAYOPVVEAFGTDILHK 
DGHNRKVLGSRWGNKKC Z'hT^^JPn \T 
REEMDRAVAIv ;GivVCVU>i*iAVLLEAGWQN.; . V H j 
EVWTAVIPEIEAVRRIVERDGLSEAAAQSRLQI'C | 
MSGQQLVEQSHWLSTACGSRISPNARWRKPGPS t 
CRSAFPRLIRPSTEKFSVGPDWLLELTSDPVVRRN 
GGU^AHPGSGPEVQAILCRTWPGLVOTGSIJOTL 
VFGQH 


3860 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVO«mAIFrVDA 

KTTBILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEAl^EEHMEADGHAAVVFGTVVDnSRS 

GEKPVSVWNDCRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

Tr^LSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNTTFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

GGHVVPRDEIRKLMESQDIFTGTQTBLIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 



EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACY ALATDLPGGLEAVEAQEVDVNSFS WNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVDDDRELLIXTGTCVDLXjQGRR 

FRESCV GHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVrVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VEL(^PTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVYVKF1K10SKVLEDCWIEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDnHRDIKDENmAEDrnm 

DFGSAAYIJ5RGKIJTTFCGTIEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVMADYTWEEVFRVNKPESGVl^AAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3861 


A 


1 


3881 


MGQKSVGASYVQEPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSIXRGLSSGWSSPLLPAPVCNPNICAIFTVDA 

KTTBILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAWFGTWDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

fflTDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLm r ^OTfflC^TH^AI.TLFGYGy T^LL 

GKNITF^' Cr,: 'MDLAYNSi>LQLPDLASCLi)V 

GNESGCGPJ".TLDPWQGQDPAEGGQDPRINVVLA 

GGHVVPRDETRKLMESQDIFTGTQTELIAGGQLL 

SO-SPQPAPGVDl>T/?EGSLPVHGEQALPKDQQrr 

ALGREEPVABESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CY GSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCV GHDPTEPLEVCLVSSEHYAASDRESPGH 

WSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVEIJEGLAACEGEYSQKYSTN1SPLGSGAFGFVW 

TAVDKEKNKEVVVTCFIKKEKVLErXIWIEDPKLG 

KVTI^IAII^RVEHAOTKVLDIFEN(^FFQL\^ 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRIJCDIIHRDIKDENIVIAEDFTIKLr 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 
SEQ ID 
NO: 


Method 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 










RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKEIJVISLVSGIJ^PVPERRTTIJEKLVT 

DPWVTQPVNLADYTWEBVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3862 


A 


399 


2069 


TMDRSKKNSIAGFPPRVEVRLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVrTCVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNrU^IIr^ILRYmSGNLEQIXDSNLHLP 

VHVRVKLAYDIAVGI^YLHFKGIFHRDLTSKNC 

LDCRDENGYSAWADFGLAEKIPDVSMGSEKLA 

WGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FUJLTFNCCNMDPKLRPSFVEIGKTLEEE^SRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRITWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRIXjAARTPKVNrTSARQDI^GGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3863 



A 


399 


2069 



TMDRSKRNSIAGFPPRVE\RLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

rTSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDL\VGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

WGSPFWMAPEVLRDEPYNEKADVFSYGIDLCEn 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

r^QLirrCCNMT ^KLP^v~-KTLLl-^f T 

EEQERDRKLQPTARGLi/.iw^ ^ /"KRLSSLDDKIP 

HKSPCPRRTEWl^RSQJSD^ARKI^RWSVIJDPYY 

RPRDGAARTPKVNPFSARQDL^^GGKIKFFDLPSK 

SVISLVFDLDAPGPGTMPIJU>V/QE?LAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3864 


A 


3 


911 


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDJFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

RAKPSNFULDRKKTDKIJCKKKKRKJ^ 

EGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

\WQEAQLMARNIX3NFSSLLESIFPS\DDDSWDLV 

TCTCMKPFAGRPMIECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 


3865 


A 


3 


3573 


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 

FLHARRRGRRSMPVSLEDSGEPTSCPATDAETAS 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 

IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 










RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYICPNCmQVQDETHSETADQQEAKWRPGDA 

TCTIXn'SlGTIEQKSSEIX^IKGRIEKAANPSGKK * 

K1JOFQPGPGPWTQLPVLWQVLEIAVSRSISAFT 

LLHCISCKVIEAPGASKaGPGCCHVAQPDSVYCS 

>nXILKHAAATMKFLSSGKEQKPKPKEKMKMK 

PEKPSLPKCGAQAGIKJSSVHKRPAPEKKETTVK 

KAWVPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PA3KKPPSGFKGTBPKRPWLSATPSSGASAARQAG 

PAPAAATAASKKFPGSAALVGAVRKPVVPSVPM 

ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 

WK/RFLFFILFRVNDSDDLIMTENEVGKIALrn^ 

EMFNLFQVTDN/RAYK5KYRSIMFNLKDPKNQG 

U^VLREEISIAKLVTOJO^EELVSKEI^TWK^ 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SQHKAHLFDLNCKICTGQVPSAEDEPAPKKQKLS 

ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDI^PCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARQDVPKPVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRl^TIAVKGFTNMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTIHIGGRIAPKTVWDYVGKLKSSVSK 

ELCLIRFHPATEEEEVAYISLYSYFSSRGRFGWA 

NNNRHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 

LSGWR 


3866 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRL^i:MRRI^!a^^ 2R? VBTlfO 

FNKTVEHGFPHQPU i\ LGYSPSLtilLAIGTRSGAIK : 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LU)DNSIJILWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITWLPHSSCELLYLGTESGNVFWQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQELIGYSRGLWTWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAC^PEPLRSLWYGPFPOCAITRILWLTTRQ 

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAJTCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDLIXTGHEIXjTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTTYAFSLRVPPAERRMDEPVRAE 

QAKE1QLMHRAPWGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 










LKLKLTALEGSRVRRVSVAHFGSRRAEDYGEHH 

LAVLTNLGDIQWSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGVLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 


3867 



A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

F^TXHEHGFPHQPSAI^YSPSIJIII^IGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLWIWDLQGSRVLY 

HFLSSQQLENTWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCXAITRILWLTTRQ 

GNIJPFTIFQGGMPRASYGDRHCISVIHrXKJQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 

APPQRDIXLTGHEIXjTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAiLPvQLMHR VPVVGtt ' 7 t>GHSY> 5 ! PF™ F^ r \H 

DLL r KSPDMQGSHQL.. V ^L. WVPiLPKVSAK 

LKLKLTALEGSRVRRVS 7AHFGSRRAEDYGEHH 

LA VLTNLGDIQ VVSLPLLF PO VRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEF£?iSLSTKG\LVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHEEPPWGA 

ASAMAEQSEWLSVQAAR 


3868 


A 


1 


2497 


GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GWARVTRLRDWI1JBATTKASMP1APTMAPAPA 

APSTAWPTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRWGGFGAASGEVPW 

QVSLKEGSRHFCGATVVGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVK1GLRRVVLHP 

LYNPGILDFDLAVLSASPIAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GHDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KIJIAELDEVNKSAKKREGELTVAQGRVKDLESL 

FHRSEVELAAALSDKRGLESDVAELRAQLAKAE 

IX5HAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGIXJKQASAAEDRIRELEEAMAGERDKFRKMLD 






WO 01/57190 



PCT/US01/04098 













AKEQEMTEMRDVMQQQIj\EYQELU)VKIj\LD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

HGHQWQRWLPPGPAGLGLGQRMnEElDLEGKFV 

QLK^SDKIXJSIXjNWRIKRQVLEGEEIAYKFrP 

KYILRAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A 


1 


1942 


RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRKRENLG/RLG 

IVRIFPVTIT\GAI\CEECGKQIGGGDIAVRASRASL 

GLLLGQPSCIAVCTrCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

GLRSVPEPPPESPGQPNLRPDDSAFGRQSTPRVSF 

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHhTRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA 


3870 


A 


2 


3485 



FVWRVFYVHASCMPPRARSWEGAHAPVGMHV 

AE. A mCfSOnQQMPPAQFmiLEWLLHIXAFl-t ; 

Tr^x^HWCCCSNPHGSIADKPHEIVT r V^'SRAAE 

NMAVEPRVATKQRPSSRCFPAGSDMNSVYERQ 

GIAVNflPTVPGSPKAPFI/iIPRGTMRRQKSIDSRI 

FLSGITEEERQFIJ^PNILKFTRSLSMPDTSEDIPPP 

PQSWPSPPPPSPTTYNCPKSPTPRVYGTIKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTnVKEPSTSSSGKSSQGSSMEE)PQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLIDIMDTSQQKSAGIJLMVHTVDATKLDNA 

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI 

SAAPEPTTVPGRTTVAVGSMEEAVILPFRIPPPPLA 

SVDLDEDFIFTEPl^PPLEFANSroiPDDRAASVPA 

I^DLVKQKXSDTPQSPSLNSSQPTNSADSKICPAS 

LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGE1WDTCTVYADGQ 

AFMVTDKPPVPPKPKMKPnHKSNALYQDALVEE 










DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEIKSPILSGPKANVISELNS1LQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMSTISGTRST 

TVTFTVRPGTSQHTLQSRPPDYESRTSGTRRAPS 

PWSPTEMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDI^GLNPAGRSRSPSPSILQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAFMDNEI 

DGSHIJmQKEDLIDLGVTRVGHRMNIERALKQ 

LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EWALVl^TQQEIIENLKPLLPAGIQDKLHTLIPC 

KKJT3LSYNLKYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMGIIIV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMT/SSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 

NEES 


3872 



A 


35 


1171 


VESRSAWHEGEDQIDRLDFERNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHffiDGMGRNLADRCTD 

EVNALVLQTQQEEENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPDNASQEELMITLVTGLASVTSRTSMGniV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLC* VDJTQKOLEFJ 71 A'^ T ^EED j 

QLEK; ^ Ti -SKIJLJCvfKAV^ l *SS ' 

NEES 


3873 


A 


2944 


2089 


PVCTALTPGRMTDDKDVLRDVWFGRL" x'-FTLY 

QDElTEREAEPYYIIl^RVSYLTLVTDKVKl^yv'^ 

KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 

SSSALPWNITVHFKSFPEKDLLHCPSKDAIEAHF 

MSCMKEADAIJKHKSQVINEMQKKDHKQLWMG 

LQNDRFE>QFWAINRKLMEYPAEENGFRYIPFRIY 

QTTTERPnQKLFRPVAADGQLHTLGDLLKEVCP 

SAIDPEIXjEKKNQVMIHGIEPMLETPLQWLSEHL 

SYPDNFLHISnPQPTD 


3874 


A 


776 


366 


QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG 

LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 

DEARKIGWGWVKNTSKGTVTGQVQGPEDKVN 

SMKSWI^KVGSPSSRIDRTNFSNEKTISKLEYSNF 

SIRY 


3875 


A 


1081 


182 


SJ^SCQTDPRPMSAPLDAALHALQEEQARLKMR 

LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

QQDPEVPKSLVSNLRIHCPLLAGSALITFDDPKVA 

EQVLQQKEHTINMEECRIJIVQVQPLEI^MVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLEIF 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 










RGGGEVEALTWPQGQQGLAVFTSESG 


3876 


A 


26 


431 


RMMKCPQALIJVIFWLLLSWVSSEDKVVQSPLSL 
VVHEGDTVTLNCSYEVTbnilSLLWYKQEKKAPT 
FIJ ? MLTSSGIEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 


A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCIDPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVMRKAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHYI 

LNGSKAFISGAGESDIYWMCRTGGPGPKGISCIV 

VEKGTPGLSFGKKEKKVGWNSQPTRAVIFEDCA 

VPVANRIGSEGQGFLIAVRGLNGGRINIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQILEGSNEVMR1LISRSLLQE 


3878 


A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 

SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 

TLGRGSSAI^ACSASVTrcC^SSPPS*SCL*PTRRS 

PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

TQGCSKLLGKQTTHLPCSTWPA**PSPSCLTRFR* 

W*PSLMCLWASSCSVCV*SPSGSCRH*LWGTHST 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSPRLTQWKSSRLTSTSHSARSAWKPSA 

TESTPSWPRFSSWTSGEDPASPAPAI 


3879 



A 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG 
NTSLCTODYKITQVLFPLLYTVLFFVGLITNGLA 
tolWFQTRS v SOTIIrT- ^ 

DAKLG iGPLRTFVCQVTSVlFYF; -^JISFLiiL-^T 
IDRYQKTTRPFKTCNPKhnXGAKJUC 4 


3880 


A 


26 


169 


QPETOIMVHLTPEEKSAWALWGKVNVDEDAG 
DDLCQILVDRPRLRI 


3881 


A 


37 


1100 


tplfdfwpgfvlswlqplsaslrarraasgppac 

rmpttvddvijbhggef 

fapiyvgivflgftpdhrcrspgvaelslrcgwsp 

aeelnytvpgpgpageasprqcrryevdwnqst 

fdcvdpiasldtnrsrh>lgpcri>^ 

ivtefnlvcanswmldlfqssvnvgff1gsmsig 

yiadrfgrkix:llttvlinaaagvlmaisptytw 

mlifruqglvskagwugyilitefvgrryrrtv 

gde^qvaytvgixvlagvayalphwrwlqftv 

alpt^ftfllyywcepesprwlisqnknaeaniriik 

hiakkngkslpasl 


3882 


A 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSY1GP 

KRTAVVRGIMHREAFNIIGRRIVQVAQAMSLTED 

VLAAALADHLPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDV Y WDiiiuA VKKY VQPFLNALGAA 

GNFSVDSQILYYAMUjVWRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLY1QDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 










AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATTTLTSLA 


3883 


A 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTKKIQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

UTVLKSSDVLDILVPILFFLNDARADQSRVGLM 

HIGVFIL11XSGECNFGVRLNKPYSIRVPMDPVF 

TGTHADLLmVFHKDTSGHQRLQPIJTWXLTIVV 

NVSPYLKSLSMVTANrOXHLLEAFSTrWFLFSAA 

QNHHLVFFLIJEYFNNnQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTOPPTIHKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

♦PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVnXRl^VDPPVWYDTDVKLFEIQRV 


3884 


A 


1 


804 


NGPRAPFSQEGQSTGPPPLIPRLGQHGAQGRIPPL 

NPGQGPGPNKDDSRGPPNHHMGPMSERRHEQSG 

GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 

SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 

RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 

RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 

GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 

LRGRGRGTPRGERVTKDTWSGRIGCElIrrWL 


3885 


A 


3 


996 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNORi .- iVAIIN ,XjW~s? :. VM 

' ; I QHPGLN.\HGAAQMQPMi^ Y O v o. LQYNSM 

TSSQTYMNG/SRPTYSMSYS^'/TPGMAPGSVMG 

SVVKSEASSSPPWTSSSHSRAPCQ A ODLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGP'/PGTAI 

NGTLPLSHM 


3886 


A 


773 


317 


QCIXJKAAEGYTQFYYVDVLDGKIACVNKCTKG 
TKSQMNC^GTCQLQRSGPRCI^PNTNTHWYW 
GETCEFNIAKSLVYGIVGAVMAVLLLALIILIILFS 
LSQ\RKRHRPESEGEADFGLENATNNFG\PTLBTV 
DSGTELHIQ\RPEMVASTV 


3887 


A 


3 


466 


\^FRVKT1XVDNKCFVLQLWDTAGQERYHSMT 

RQLLRKAIXjVVLMYDITSQESFAHVRYWLDCL 

QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 

AQELGVYFGECSAALGHNILEPVVNLARSLRMQ 

EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A 


3412 


3144 


QNIDITNFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMIAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 


3889 


A 


1 


1160 


lvwaitaiijvfpneytrmstseliselfnix:gll 

dssklcdyenrfntskggelpdrpagvgvysam 

wqlaltldjkivitifttgmkipsglfipsmavgai 

agrllgvgmeqlayyhqewtvfnswcsqgad 

citpglyamvgaaaclggvtrmtvslvvimfel 

tggleyivplmaaamtskwvadalgregiyda 










HnaNGYPFLEAKEEFAHKTLAMDVMKPRRNDP 

LLTVLTQDSMTVEDVETnSETTYSGFPVVVSRES 

QRLVGFVlJRRDLnSIENARKKQDGWSTSnYFTE 

HSPPLPPYTPPTLKIJRNII^^ 

DIF1(KLG1^QCLVTHNGRLLGIITKKDVLKHL\^ 

MANQDPDSILFN 


3890 


A 


1 


387 


SWCWTGIFVLGTTNLRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RWYTKI^HCELENEYAINKF^ 
NLSERQVTTWFQNRRVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVniPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPT\VRAAELEQVPHIALFLFK 

KTRLSITICFPSKFLLPYCGLDTLADQN\NQVRKT 

SQAALLNALLEQELIERFDVETKVCPVLIELTAPDS 

NDDVKTEAVAIMCKMAPVMVGKDITERLILPRFC 

EMCCIXIRMFH\VTUOVCAANFGDICSVVGQQAT 

EEMUJRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVBLENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPOTTMATRKELEEM1ENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLrlYIH 

NDSDLSNNSSFSPDEERRTKVQDVVPQALLEKJY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

C^TOCLitETr FTl^ASDMQW- l^RTv^. /VFSUJi^ A 

VILv;3>\QLTAA^ 

\ JIDFIJQXHIDKRREYLYQIXJEFLVTDNSRhrVVR 

r^\HLAEQLILlJLELYSPRDVYDYLRPIALNLCAD 

KVSSVRWISYKLVSEMVKK1JHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHIXTLANDRVFN^ 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTKISEDAMSTASSTY 


3892 


A 


158 


2191 


VPLPAPSGLSGGGSRGAGCKKAPPGRAPAPGLAP 

1JRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRIJ'LAFRDATSAPLRKLSVDLIKTYKHINEV 

YYAKKKRRAQQAPPQDSSNK1GEKKVLNHGYDD 

DNHDYIVRSGERWLERYEIDSLIGKGSFGQVVKA 

YDHQTQELVAIKIIKNKKAFLNQAQIELRLLELM 

NQHDIEMKYYIVHLKRHFMFRN\HLCLVFELLS 

YNLYDLLRNTHFTIGVSLNLTRKLAQQLCTALLF 

IATPELSimCDLKPENILLCNPKRSAIKIVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

SLGCILVEMHTGEPLFSGSNEVCPQEGVDQMNRI 

VEV1X3IPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 










GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 

QAPASASSLPGTGAQLPPQPRYLGRPPSPTSPPPP 

ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 

S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTWRPASPTEAGTDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 

AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 

PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 

QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPLPGT 

HSGPPPAAVSLPPAAAACPVWPPPLPHHPPDLES 

PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 

LLPLPRPPS*P/VPWKPLHSPVAVAGGSFVAGGSV 

LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 

PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYI^GNGSCWVKVTASSDLSDLISCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVIKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF 

LPHPHELQQFQAEGK1YECNHVEKSVNHGSSVSP 

PQHSSTIKTHVSNKYGTDFICSSLLTQEQKSCIRE 

KPYRYIECDKALNHGSHMTVRQVSHSGEKGYKC 

DLCGKVFSQKSNIARHWRVHTGEKPYKCNECD 

RSFSRNSCLALHRRVHTGEKPYKCYECDKVFSR 

NS<XALJIQKTHIGEKPYTCKECGQAFSVRSTLTN 


3896 


A 


202 




mVqscsayGv .^wA.*: : a. ^j>kpvsfhkfpltrpslc 
keweaavrrki^;cfrkyssicsehftpdcfkrec 
nnkllkenavfi ii7 .ctephdkkedllepqeq 


3897 


A 


2 


382 


SHGLSRAPHLSAAPA?AL.\SRPCFSSAPCSQGGG 
GGGPATMIHFILLFSRQGKLRLQKWYITLPDKER 
KKITREWQIII^RGHRTSSFVDWKEUaVYKRYA 
SLYFCCAIEXNQDNELLTLENVHR 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWDILETEEHYKSRWRSIRIL 
YLTMFLSSVGFSWMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 


3899 


A 


24 


718 


FRGRPGIPEREGKGNHSFVEVARVIWDLHSRLG 
GAMAERKGTAKVDFLKKIEKEIQQKWDTERVFE 
VNASM-EKQTSKGKYFVTFPYPYMNGRLHLGHT 
FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 
ACADKLKREEELY/GCPPDFPDEEEEEEETSVKTE 
DIIIKDKAKGKKSKAA/AKAGSSKYQWGIMKSLG 
LSDEEIVKFSEAEHWLDYFNALAIQDLKRMG 


3900 


A 


360 


1 


VPATSSNVSPSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901 


A 


193 


345 


GEWAVPPAPGGQGVS1PHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP 


3902 


A 


1188 


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSHAANPALSPRAPHSHYRPRPRCGPRRRPR 


3903 


A 


63 


396 


NNMR>n > HlJSSNHYLNLARTETVFARMESVKQRI 
IJU>GKEGLKNFAGKSIX}QIYRVLEKKQDTGETIE 
LTEDGKPL*WERKAPIXDCTCFGLPRRYIIAIMS 
GLGFCISFG 


3904 


A 


732 


1046 


AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 
VAPNALMFmFM*NCTFVPKLP*VMDLK**LQYK 
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTOGGLHYSEPESGCSSDDEHDVG 

MRVGAEYQARIPEFDPGATKYTDKDNGGMLVW 

SPYHSIPDAXLDEYIAIAKEKHGYNVEQALXjMLF 

WHKHNTEKSLADUWIPFPDEWTVEDKVLFEQ 

AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 

TRSRTSLMDRQARKIA^TO^NQGDSDDDVEET^IP 

MDGNDSDYDPKKEAKKBGMS 


3906 


A 


2 


513 


KV(^CCSQEJL£TSinrYVDKNINLEQR^SSPSAK 
GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSJDDYGFISAILFLVTGILLVIISYIVPREV 
TVDPNTVAAREMERLEKESARLGAHLDRCVIAG 
LCLLTLGGVELSCLLMMSMWKGELYRRNRFAS 


3907 


A 


71 


412 


ILIMSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRVVITGIGLVTPLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 


3908 



A 



77 


746 


LGTLLGWRAPLFSRCLAFHSPFILLNTPKLVKTAE 

LPPDRNYVLGAHPHGIMCTGFLCNFSTESNGFSQ 

Lf^UlP\VI^VL.V3LFYT-?VYRDY^ * 

RQSLDFn.SQPQLGQAYY^ 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 4 


3909 


A 


1 


793 


FRAAGRPAAAMGDIPVVGLSSWKASPGKVTEAV 

KEAIDAGYRHFIKIAYFYHNEREVGAGIRCKIKE 

GAVRREDLLIATKLWCTCHKKSLVETACRKSLK 

ALKLNY1JDLYLIHWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVIPSDTDFLDTWEAME 

DLVnXJLVKhnGVSNFNHEQLERIXNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDLmNPVIKRIAKEHGKSPAQILI 


3910 


A 


202 


705 


KK1MHRKKVDNRIRILIENGVAERQRSLFVVVGD 

RGKIXJWILHHMLSKATVKARPSVLWCYKKEL 

GFSSHRKKRMRQLQKKKNGTLNIKQDDPFELFI 

AATbnRYCYYNETHKILGNTFGMCVLQDFEALTP 

NLLARTVETWGGGLVVILLRTMNSLKQLYTVT 

M 


3911 


A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

l^RPAIVQNITFGKYEKTHVCNLKKFKVFGGMN 

EENMTELLSSGLKNDYNKETFTLJKHKmE^ 

RFDOVPLLSWGPSFNFSIWYVELSGIDDPDIVQPC 










LNWYSKYREQEAIRIXXKHFRQHhrnEAFESLQ 
KKT 


3912 


A 


2 


461 


FEKKQLRRPSLFLLGCCSFGIMAPSLWKGLEGIG 

LJALAHAAFSAAQHRSYMRLTEKEDESLPIDIVL 

QTUjVFAVTCYGIVHIAGEFKDMDATSELKNKTF 

DTVRNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 

LSSNTSLKLRKLESLRR 


3913 


A 


362 


20 


APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKIVLLGDMN 
VGKTSLLQRYMERRPPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 


3914 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRI^ESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKST1XNEKHIJCKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRKERKLSVLGKDGKPVSEYTIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEWHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SBKQRKSKVEDKPFEBTGVEPVLETASSSAHSTQ 

/ADS SHR/ JCLH ^KI^T^SDKD^TST^ FRKLSD 

OHKSRSLKHSS; :\ jZZL ^ENKSDDKDGKEVDSS 

HEKARGNSSIMbI<i:;LSRRUSNRRGSI^QEMAK 

GEEKIAANTLSTPSo?SJ,QRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSk'i■QI;^^tNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMfflQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKMJChTTAAEEHVAQGDATIJEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSAl^NQSLTVRESEVIJCTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIYENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISmVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLHGTHSRNOTIJWGAEASECTVFAAAEE 

GGAWTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGlixi'lSEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV * 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGIIEDGEGPASCTGSEDSSEGFAISSESEENGESA 










MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITOQNSLAGGK 

NQGKVLnSTSTTNDYTPQVSAITDVEGGLSDALR 

IEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVWESE 

NERAGTVMEEKDGSGHSTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGADCADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMEPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

EKVCDIGlffiESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDL nKSCB 


3915


A


1


.;545 


]rK}IRVGITSQTG1^3ha^^^^ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTC^PAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

L YAD VTDP VLCLGQKDPG VEGKHCEKEKISS SK 

EUQTraAKSEPSKPARRLSESLHVVDENKNK 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKI)DSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIDCTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEWHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRES YDPD V1PLFDKRTVLEG STA STSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHWGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTL^PVKAGPATTTSSETRQSEVALPCTS 

IEADEGLnGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADUAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKG SKDTDICSS AKGIVESS VTSA VSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

USTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMtSTSIGEEFELPISSATTIKCAESLQP 

VA < v, EERA 1 GF VMST \DFEGF:vC*SAF?.3AES? 

LA^T^'IEKDECALISTSIAEECEAS VSGVWI ; I 

1SF^j\GTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGrPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQD£D:O.TITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADWEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHUNAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSrXPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDySGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIG>mESPLNVLGGLKLKANIJCMEAYVPS 

EEEKNGEEJU^ESl^GGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 


3916 


A 


2 


773 


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FIRSQUJVLEAIJFAKTRYPDIFMREEVALKIM^E
SRVQVWFKNRRAKCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFIPPAVSSSASSSSSASSSSA 
WAAAAAAGLWAKI^CPIJHDDFSLCVFIEENRLV 
SGSWARDIRSVEETDKSGYR 


3917 


A 


2 


776 


RNIPGRRFRPPGLRRIJLKGPHMPREPRGYRTRVP 

ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

NVQAGGAIAPPRHLCGLCSRLHFIJCPDLSWAA 

PSRAGASVMALRKELLKSIWYAFTALDVEKSGK 

VSKSQIJR\a£Hhtt,YTVI^ 

IX3PVSS(^YMPYLNKYILDKVEEGAFVKEHFDE 

LCWTLTAKKNYRADSNGNSMLSNQDAFRLWCL 

FNFLSEDKYPL1MDPDEGEYLLKRYS 


3918 


A 


10 


318 


WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 
CTGLWQAQRQASRQSQRELGGQVDLFKRRW 
RRLASLKTRRCRLSRAAQGLPDPGAETCAVCLD 
YFCNKQ 


3919 


A 


1 


204 


RVLTAINHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEEh^KTKKQHRKERLYPLRKYA 
AKA 


3920 


A 


1 


654 


RCCRSFVAPLQEKWFGLFFLGAILCLSFSWLFHT 

VYCHSEGVSRLFSKLDYSGIALLIMGSFVPWLYY 

SFYCNPQPCFIYLIVICVLGIAAIWSQWDMFATPQ 

YRGVRAGVFLGLGLSGOPTLHYV1SEGFLKAATI 

GQIGWLMLMASLYITGAALYAARIPERFFPGKCD 

rWTOSHQLFHIFWAGAFVHFHGVSNLQEFRFMI 

GGGCSEEDAL 


3921 


A 


1587 


452 


LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRP VAAPS RTPA PPHTRARA SPGLPS G 

PAWRRVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCPSSYNILD^ISICIICEECRy^LTALL it, Y*> T ?,IT. 

FHRTVKEfOimWEW^^ 

AQVVRESNAMLREGYKTrTOTLYHNNlPLFIFSA 

GIGDILEEmQMKVFHPNMW 

QGrlCGQLIHTYNKNSSACENCGYFQQLEGKTNV 

311XjDSIGDLTMADGVPGVQNILKIGFLM)KVEE 

RRERYMDSYDIVLEKDETLDVVNGLLQHILCQG 

VQLEMQGP 


3922 


A 


2 


164 


GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSELHTSYGRERPAPVHLRQDT 


3923 


A 


2 


3258 


EHATHAYAKLGTRRRHREVTVFVPTWQLKKNR 
R\rt*ESHrT,TKUlSUC^^ 

YRFMVKLAEETDGHVIT^EQIHn^MNSSKKIMVX 

DRLLPFTFAGNLFMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVWKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDE)LLPGAASPYLGIPWDGKAPCQQVLAHL 

AQLT1PSNFTALSFFMGFMDSHRDAIPDYEALVG 

PLHSLLKQKPDWQWDQEHEEAFLALKRALVSAL 

CXMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTP WLDLS YA SRTTADPE VREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVLPPFSDLSTFVCIHMSGYCFYR

EDEWCAGFGLYVI^PTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPVVrXTHCNWEFSLLWE 

LLPLWRARGFLSSDGAPLPHPSLLSYESLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 

IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWVVPTQLRRDLIFSVHDIPLGAHQR 

PEETYKKLRLLGWWPGMQEHVKDYCRSCLFCIP 

RNLIGSELKVIESPWPLRSTAPWSNLQffiWGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 

LKEFnT.HGKKWAASU>LLHLAFRASSTDATPFK 

VLTXjGESRLTEPLWWEMSSANIEGLKMDVFLLQ 

LVGELLEUT^VADKASEKAENRIOTCRESQEKE 

WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 

SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 

KVLEQ 


3924 



A 


1 


1826 



MGSVTWYFCYGCLFI^ATWTVLLFVYFNFSEV 

TQPLKNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKANKIDDVIDSRVEDPEEGHLKFSSELGMF 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD 

TRNAACKEKFYPPDU>AASWICFYNEAFSALLR 

TVHSVIDRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKJKVIRNTKREGLIRGRM1GAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDnSADTLAYSSSPVVRGGFNWGLHFKWDLV 

PLSELGRAEGATAPIKSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGjHQFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEr "rC^SL^^DL^IT' : ~ USER Y^Li; "VT 

GCKSFXWYLDNVYPEN:-. KFQQPIFVNR 

GPKRPKVLQRGRLYHLQih'LCLVAQGRPSQKG 

GL VVLKACDYSDPNQI WIY WI^EMELVLNSLLCLD 

MSETRSSDPPRIJMKCHGSGGS(^WTFGKNNRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ 

WHLEG 


3925 


A 


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNCIV1SSL 

VTTQRKLKAMSLLGSRNQLARAVLNFNPMDFCT 

KDLLTTTSERIIAYLRDFNEIXJKKAJETAYAMVK 

HSPSVAKICLfflGPPGTGKSKTIVGLLYRLLTENQ 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

KKIILEFKEKCKDKKNPLGNCGDINLVRLGPEKSI 

NSEVLKFSLDSQVNHRMKKEU>SHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQEIASKIKEVQGRPQKTQSIIILESHnCCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

E1LTPLIHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEENVEHNMISRLPILQLTVQ 

YRMHPDICLFPSNYVYmNIJCT^QTEAIRCSSD 

WPFQPYLVFDVGDGSERRDNDSYINVQEIKLVM 

EIIKLIKDKRKDVSFRNIGnTHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVTVOXTVRA 

NSIQGSIGFIASLQRLJmTniAKYSLFILGHLRTL 

MENQHWNQLIQDAQKRGAIIKTCDKNYRHDAV

KIOa^KPVLQRSLTHPPTIAPEGSRPQGGLreSKL 

DSGFAKTSVAASLYHTPSDSKETILTVTSKDPERP 

PVHDQLQDPRLLKRMGEEVKGGIFLWDPQPSSPQ 

HPGATPPTGEPGFPWHQDLSHVQQPAAWAAL 

SSHKPPVRGEPPAASPBASTCQSKCDDPEEELCH 

RREARAFSEGEQEKCGSETHHTRRNSRWDKRTL 

EQEDSSSKKRKLL 


3926 


A 


99 


284 


MPREDRATWKSNYFIJCIIQLLDDYPKRFrVGANN 
VGSKQMQQIRMSLRGKAVVLMGKNTMMR 


3927 


A 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFL1HYYASGENW1 
FGDFMCKFIRFSFHFNLYSSILFLTCFSIFRYCVIIH 
PMSCFSIHKTRCAWACAVVWnSLVAVlPMTFLI 
TSTNRTNRSACU)LTSSDELNTIKWYNLILTA\LL 
CLPLVIVTLCYTTIIHTLTHGHAN\DSCLKQKARR 
LTELLL 


3928 


A 


1 


1516 


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR 

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR 

ATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFS 

RQCVVT)KDKRNQCRYCRIJKXCFRAGMKKEAV 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSGINGDIRAKBCIASIADVCESMKEQLLVLVE 

WAKYIPGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMWKDVLLLGNDYTVPRHCPELAEMSRVSIR 

BLDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGL 

SDPGKDCRLRSQVQVSLEDYINDRQYDSRGRFGE 

1XLLLPTLQSITWQMIEQIQFIKLFGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVA1TVKPLSAIPQPTITKQE 

(/ * * 


3929 


A 


1 


2782 

* 


LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PD7 .QGPEQSPNDAHRGAESENEEESPRQESSGEEI 

INxl;DFAQSPESKDSTEMSI^RSSQDPSVPQNPm 

LGHSNPU3HQIPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPWPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERPNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH 

AGEOYRCreCGKSHQSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSTLIKHQRTHLREDPFKCPVCG 

KTFILSAT1JJUIQRTHTGERPYKCPECGKSFSVS 

SNLJNHQRmGERPYICADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSNLmr/RTHMDENLFVCSD 

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL 

LGLGDPSLLTPPPGAKPHKCLVCGKGFNDEGIFM 

QHQRIKGEWYKNAIXjLIAH 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVLL

EHLRSPLGARPYRCSIX:RASFli)RVALTRHQETH 
TQEKPPNPEDPPPEAVTI^TDQEGEGETPTPTESS 

HRSCHPGVSL 


3930 


A 


513 


273 


KTQETHIYISEHJUTPFIX^FGNLPICMAKTDLSLS 
wnprnrif nvp^DFTi PT^invp a <jto a hftvpt vrjTr: 

n\^rLfjSJ\\J v rour hatijoU VisAoHJAUr i x rL V vj 1 \j 

SRESPLWL 


3931 


A 


16 


305 


KRRDFl^CWPAFTVLGEARGDQVDWSKLYKDT 

m \7V"\4CPVPP A QQPT7CT^JNrLTDCT r DVT>'D/T.I> HVUDT T 

OJL V JSJV1 oi\JSJ\KJ\ oorr oWrJnJro I r JSJ^OKuJSJnLrLl 
PGPEAl^KFPRQPIREKGPVKEVPGTKGSP 


3932 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSIPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933 


A 


1 


1546 


sthasehwdsalqlakhlapdqipf1skeyaiqle 

fagdyvnalahyekgitgdnkehdeaclagva 

qmsirmgdirrgvnqalkhpsrvucrix:gai^ 

nmkqfseaaqlyekglyydkaasvyirsbcnwa 

kvgdllphvsspkihlqyakakeadgrykeavv 

ayenakqwqsviriyldhlnnpekavnivretq 

sijdgakmvarfrxqlgdygsaiqflvmskcnne 

aftlaqqhnkmeiyadiig sedttnedyqsial y 

fegekrylqagkffixcgqysralkhflkcpsse 

dnva1emaeetvgqakdelltnqlidhllgend 

gmpkdakylfrlymalkqyreaaqtaii1aree 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLMILHSYILVKIHVKNGDHMKGARMLIRVANN 

ISKFPSHIVPILTSTVIECHRAGLKNSAFSFAAML 

MRPEYRSKIDAKYKKKDBGMVRRPDISEIEEATTP 

CPFCKFLLPESELL 


3934 


A 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 

G1GGFLVSLSSFJ i^LQTLAVSVT A T .vttwr 4YVi- 
Cqi l t r .^iDALRll'LEQiuURRMC. v 7^ 
A*0\i>TOTQKlJ^CLIGW^ 
LGVRYLTLTHTCNTPWAESSAKGVli^FYTWISGL 

TT\rOT?VT n 7 4 r )| A"KTTi T TV JT%. /T\ TV\1 Ctn/CT\ AN-' * AT 

TDFGl^VVAEMNlU-GMMVDLSHVS A xjXAL 

EVSQAPVIFSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSU^Gl^IQWQPIRPMCSTVADHFD 

HKAWGSKHGIGGDYIXJAGKYRKKTTCKAPW 

RTSSRMSS 


3935 


A 


1 


883 


HETTPAWQSVLLERGWNKFDKQEQNAEDWNL 

YWRTSSFRMraHNSVKPWQQLNfiWGTTXLTR 

KTC1AKHLKHMRRMYGTSLYQOTLTFVMPNDY 

TKWAEYFQERQMU3TKHSYWICKPAELSRGRG 

iL-irolJr KJJr irUUm Yl V V£ Jv I xoiNrLJLJ.uK x JJJLK 

IYVCVTGFKPLTTYWQEGLVRFATEKFDLSNLQ 

NWAmTNSSINKSGASYEKIKl^IGHGCKWTLS 

IVL f O I J-»IV»J VT xJ T UxJXJl.iM-t TT AXVUUUT1 T JULr 1 IX^XVLTVf tj 

WFAANCFl^FGFDILIDDNEFHRTG 


3936 


A 


203 


441 


HIAHSLGP1JPKHYQYCVRYLYYQVTKDVIKEFA 
DIXSVKYl^LRSTPRRENATGMTKKTYVESILEGI 
KQSKQENLDIDV 






WO 01/57190 



TABLE 7 



PCT/US01/04098



SEQ ID NO:


Position of end of Signal in Amino Acid Sequence


MaxS (Maximum Score)


MeanS (Mean Score)






1 


19 


0.930 


0.680 


2 


24 


0.964 


0.863 


3 


21 


0.990 


0.901 


4 


19 


0.981 


0.942 


5 


22 


0.991 


0.928 


6 


21 


0.956 


0.843 


8 


22 


0.913 


0.718 


9 


17 


0.997 


0.969 


11 


19 


0.930 


0.680 


13 


36 


0.983 


0.863 


14 


28 


0.935 


0.839 


15 


21 


0.997 


0.955 


16 


16 


0.983 


0.944 


17 


18 


0.989 


0.884 


19 


49 


0.996 


0.719 


20 


28 


0.972 


0.920 


21 


23 


0.954 


0.905 


22 


46 


0.955 


0.568 


23 


26 


0.942 


0.654 


24 


19 


0.979 


0.941 


25 


34 


0.884 


0.565 


26 


33 


0.934 


0.584 


27 


17 


0.975 


0.914 


28 


18 


0.980 


0.934 


29 


23 


0.928 


0.718 


30 


26 


0.978 


0.885 


32 


20 


0.946 


0.719 


33 


29 


0.933 


0.671 


35 


25 


0.996 


0.920 


36 


26 


0.903 


0.579 


40 


19 


0.981 


0.942 


47 


25 




&909 . 


53


22


0.991


0.928 


55 


24 


0//60 


0.808 


60 


19 


0.986 


0.967 


78 


22 


0.913 


0.718 


86 


20 


0.883 


0.555 


87 


24 


0.982 


0.889 


88 


17 


0.997 


0.969 


115 


19 


0.930 


0.680 


134 


36 


0.983 


0.863 


136 


17 


0.913 


0.696 


137 


19 


0.958 


0.905 


140 


28 


0.935 


0.839 


143 


32 


0.914 


0.740 


153 


21 


0.997 


0.955 


154 


25 


0.913 


0.583 


155 


29 


0.972 


0.857 




30 


0.977 


0.817 


170 


30 


0.977 


0.819 


171 


30 


0.977 


0.819 


175 


47 


0.926 


0.606 


176 


30 


0.968 


0.872 


177 


22 


0.957 


0.791 


192 


43 


0.930 


0.678


195 


19 


0.956 


0.860 


202 


21 


0.982 


0.871 




24 


0.957 


0 870 






0 954 


ft 005 


224 


46 


0.955 


0.568




26 


0 942 


ft 654 






0 961 


ft 830 




28 


0.994 


0.937


232


28


0.993


0.896


234


19


0.979


0.942


235


19


0.979


0.941




20 


0 987 

V.JO / 


ft 043 

W.J*t«7 


244 

ATT 


23 


0 929 


0 683 


9Sft 


34 


ft 884 


ft S65 


9S6 


33 


0 034 

V/.X«7 t 


ft 584 




25 


ft 034 


ft 790 

v. / X7 


95Q 


99 


n 060 


ft 871 


264 

tin 


19 


0 052 


ft 753 


965 


17 


n 075 


ft 014 


966 


17 


ft 075 


ft 014 


971 


93 


ft 074 

V. J/*t 


ft 884 


974 
/*♦ 


13 


ft 071 


ft 834 


975 


18 


ft Q8ft 

U.70U 


ft 034 


978 
^ / o 


39 


ft 058 


ft 668 


98n 


94 

Jtrr 


ft 066 


ft RR1 


9R1 


94 


ft 066 


ft 881 


9R£ 


93 


ft 09R 


ft 71 R 


901 


35 


ft 001 


ft R94 


903 


97 
z/ 


ft OSfi 

U.JJO 


ft Rfi£ 


904 


93 


ft 0S9 


ft R97 


301 


26 


0.978 


0.885 


316 


20 


0.946 


0.719 


320 


28 


0.978 


0.726 


i ^ :7 


79 


0.933 


0.671 


J 331 . - ; * 


0.903 ]-..<?! 


345 


7/j 


0.996 


0.920 


349 




0.903 


0.579 


351 


24


0.951 


0.876 


352 


18 


0.944 


0.716 


353 


32 


0.992 


0.854 


354 


27 


0.945 


0.817 


355 


16 


0.922 


0.716 


356 


13 


0.959 


0.818 


357 


23 


0.986 


0.878 


358 


19 


0.904 


0.671 


359 


16 


0.988 


0.951 


360 


15 


0.981 


0.938 


361 


18 


0.944 


0.716 


362 


21 


0.984 


0.869 


363 


40 


0.979 


0.813 


364 


18 


0.883 


0.693 


365 


22 


0.962 


0.908 


366 


22 


0.961 


0.827 


367 


44 


0.941 


0.624 


368 


20 


0.952 


0.791 


369 


22 


0.949 


0.840 


370 


28 


0.957 


0.682


372 


28 


0.974 


0.894 


373 


19 


0.972 


0.947 


374 


29 


0.968 


0.785 


375 


19 


0.949 


0.897 


377 


23 


0.962 


0.910 


378 


31 


0.974 


0.895 


379 


26 


0.969 


0.939 


380 


27 


0.945 


0.817 


383 


27 


0.945 


0.817 


384 


25 


0.992 


0.877 


385 


32 


0.983 


0.825 


386 


44 


0.924 


0.564 


387


26 


0.971 


0.894 


388 


19 


0.989 


0.862 


389 


24 


0.990 


0.947 


390 


34 


0.942 


0.635 


391 


16 


0.922 


0.716 


394 


19 


0.987 


0.970 


398 


36 


0.992 


0.866 


404 


13 


0.959 


0.818 


417 


23 


0.986 


0.878 


421 


19 


0.904 


0.671 


425 


28 


0.971 


0.717 


431 


16 


0.988 


0.951 


452 


18 


0.944 


0.716


459 


21 


0.991 




468 


21 


0.984 


0.869


478 


40 


0.979 




486 


18 


0.883 


0.693


499 


22 


0.962 


0.908


501 


19 


0.962 


0.877 


514 


44 


0.941 


0.624 


529 


20 


0.952 


0.791 


533 


39 


0.914 


0.719 


548 


•25. 


0.957


0.682


561 


28 


0.974 


0.894 


562 


28 


0.974 


0.893 


564 


18 


0.949 




576 


19 


0.972 


0.947 


584 


29 


0.968 


0.785 


585 


28 


0.973 


0.810 


591 


19 


0.949 


0.897 


592 


24 


0.991 


0.954 


594 


20 


0.985 


0.959 


595 


20 


0.985 


0.959 


612 


23 


0.962 


0.910 


619 


31 


0.974 


0.895 


621 


15 


0.959 


0.795 


633 


26 


0.969 


0.939 


640 


20 


0.949 


0.842 


645 


25 


0.911 


0.759 


684 


25 


0.992 


0.877 


691 


32 


0.983 


0.825


698 


44 


0.924 


0.564 


700 


19 


0.982 


0.941 


710 


26 


0.971 


0.894 


714 


23 


0.965 


0.907


/lo 


19 


0.989 


0.862 




21 


0.976 


0.851 


/xo 


33 


0.961 


0.895 


734 


25 


0.963 


0.660 


741 


34 


0.942 


0.635 


/44 


19 


0.959 


0.924 


HA1 


16 


0.922 


0.716 


OO 


26 


0.973 


0.864 


/Of 


22 


0.986 


0.943 


7oo 


27 


0.916


0.758 


769 


19 


0.987 


0.970 


770 


22 


0.981 


0.933 


771 


34 


0.993 


0.893 


773 


20 


0.968 


0.939 


774 


21 


0.971 


0.945 


775 


22 


0.986 


0.943 


779 


32 


0.973 


0.846 


7ol 


All 

23 


0.950 


0.857 


785 


27 


0.916 


0.758 


786 


27 


0.916 


0.758 


788 


22 


0.981 


0.933 


793 


22 


0.986 


0.803 


794 


39 


0.892 


0.654 


797 


27 


0.965 


0.847 


810 


22 


0.981 


0.933 


823 


34 


0.993 


0.893 


825 


17 


0.962 


0.778 


837 


20 


0.968 


0.939 


844 


25 


0.984 


0.951 


845 


17 


0.919 


0.706 


846 


21 


0.971 


0.945 


847 


21 


0.971 


0.945 


890 


22 


0.986 


0.943


893- 

£.94 


24 


0.971 ; 0.865 


24 , xm • 


0.86: . 


896 


32 


0.973 


0.846 


899 


31 


0.982 


0.817 


922 


15 


0.882 


0.706 


924 


21 


0.975 


0.948 


925 


21 


0.927 


0.661 


933 


20 


0.967 


0.906 


960 


20 


0.967 


0.906 


967 


38 


0.970 


0.784 


968 


47 


0.970 


0.557 


972 


36 


0.945 


0.775 


TABLE 8 



SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


3955 


A 


235 


1272 


GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAWNPTRWHLPAQPEMLYEGGEGRMETLK

DKTIXJELEELQNDSEAIDQLALESPEVQDLQLERE 

MA1ATNRSLAERNLEFQGPLEISRSNLSDRYQELR 

KL\nERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KIEEESEAMAEKFLEGEVPLETFLENFSSMRMLSH 

LRRVRVEKLQEVVRKPRASQELAGDAPPPRSPPP 

V/PPSPPGNTPCG*RAAAAT1SHASLPFALQPIPQPA 

CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 


3956 


A 


821 


385 


SICADRTERVGIFFYIPAGTTDEADVTHP*EGHSYL 

SNHAG1QRSSRP/SHYQGEAVHDNCFTADELQLLT 

YQLCHTYVRCTRSVSIPAPAYYAHLVAFRARYHL 

VDKEHDSAEGSHVSGQSNGRDPQALAKAVQIHQ 

DTLRTMYFA 


3957 


A 


4621 


240 



ELISTFKLLIJEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEVDEMMIMIE 

KESVEVAKTEKIVKADETIANEQAMASKAIKDEC 

DADIJ^GAI^IIJBSALAAIJDTLTAQDITVVKSMKSP 

PAGVKLVMEAICILKGIKADKIPDPTGSGKKIEDF 

WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMNIIR 

K>TYTPNPDFVPEKIRNASTAAEGLCKWVIAMDSY 

DKVAKWAPKKIKIAAAEGEOOAMDGLRKKQA 

ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 

SKKLERAEQLIGGLGGEKTRWSHTALELGQLYIN 

LTGDILISSGVVAYLGAFISTYRQNQTKEWTTLCK 

GRDIPCSDDCSLMGTLGEAVTTRTWNIAGLPSDSF 

SIDNGmMNARRWPLMIDPQSQANKWIKNMEKA 

NSLYVIKLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSTIEYAPDFR 

FYITTKIJINPHYLPETSVKYTL^^ , 

LLcrvvAQERPDLSEEKQ, Qa^roe.c;. 

DKII^LSSSICun'IlEDET^ 

QEVAEETEKKTOTTRMGYRPIAmSSIUTSI^ 

hHEPMYQYSLTWFINLFIl^IENSEKSEILAKRLQIL 1 ; 

KDHFTYSLYVNVCRSLFEKDKLLFSTCLTTNLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQKS . 

WDEICRLDDLPAITCTIRREFMRLKIX5WKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLORCLRPDKVIPM 

LQEFDNRLGRAFIEPPPFDLAKAFGDSNCCAPLIFV 

LSPGADPMAALLKFADDQGYGGSKLSSLSLGQGQ 

GPIAMKMUEKAVKEGTWVVLQNCHLATSWMPT 

LEKVCEEI^PESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKG1JRANIIRSYLMDPISDPEFFGSC 

KKPEEFKKLLYGLCFFHALVQERRKFGPLWWNIP 

YEFOTTDLRISVQQLHMFLNQYEELPYEALRYMT 

GECNY GGRVTOD WDRRTLRSILNKJFFWELVENS 

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEVVNEVASDEXjKLPNNFDIEAAMRRYPT 

TYTQSMNTN^VQEMGRFNKIXKTIRDSCVNIQKA 

IKGIAVMSTDLEEVVSSILKVKIPEMWMGKSYPS 

LKPLGSYVNDrlJ^RLKFLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKIKKLAESH 

PKILYDTVPVMWIJCPCKRADIPKRPSYVAPLYKT

SERRGVLSTTGHSTNFV^\MTLrei>QPKEHWIG 
GVALLCQLNS 


3958 


A 


35 


529 


GADMAKSKNH'llHNQSRKWHRNVKKPLSQRYK 
SIXGVDPKFLGl^CFTKKHKKKGLKKMQADSA 
KA VSTCAKAIE ALVKPKEVKPKIPKG VSCELN* LA 
YIAYPKFWTCACACIAKGLRLCQPKAKAQDQTK 
AQVQDCAQAAAPASVPTQAPKGAQAPTKASG 


3959 


A 


1883 


763 


1XVLLLRTNLLIASSTRISRATLTCSPPG1PVDPRVR 
PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 
QUTDPEPVK^QIAFTQGAWVGFSGGVWRVPR 
ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 
PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 
PQIIKEVLAVPNSILELPCPHLSALASYYWSHGPAA 
VPEASSTVYNGSLLLrVQDGVGGLYQCWATENGF 
SYPV1SYWVDSQDQTLALDPELAGIPREHVKVPLT 
RVSGGAAIAAQQSYWPHFVTVTVLFALVLSGALI 
ILVASPLRALRARGKVQGCETLRPGEKAPLSREQH 
LQSPKECRTSASDVDADNNCLGTEVA 


3960 


A 


1 


481 


S Y AAPSLFVKSL YWALAFMA VLLA V SG VVIVVLA 
SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 
SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 
AWRGPQGWHWIDEAPLPPQ3XPEDGEDNLDINCG 
ALEEGTLVAANCSTPRPWVCAKGTQ 
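
The one-letter legend in the table header can be applied mechanically when checking an entry. The following is a minimal sketch, not part of the original disclosure: it assumes the legend as printed (twenty amino acids plus X, with `*`, `/` and `\` as the stop, possible-deletion and possible-insertion markers), and the function name and the sample string are hypothetical. Any other character, such as a lower-case letter or digit left by scanning, is counted as unrecognized rather than guessed at.

```python
# One-letter codes taken from the table legend; X marks an unknown residue.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue markers from the same legend.
MARKERS = {
    "*": "stop_codons",
    "/": "possible_deletions",
    "\\": "possible_insertions",
}


def summarize(sequence):
    """Count residues, legend markers, and unrecognized characters.

    Whitespace (line breaks inside a table cell) is ignored.
    """
    counts = {
        "residues": 0,
        "stop_codons": 0,
        "possible_deletions": 0,
        "possible_insertions": 0,
        "unrecognized": 0,
    }
    for symbol in "".join(sequence.split()):
        if symbol in AMINO_ACIDS:
            counts["residues"] += 1
        elif symbol in MARKERS:
            counts[MARKERS[symbol]] += 1
        else:
            counts["unrecognized"] += 1
    return counts


# Hypothetical fragment, not taken from any SEQ ID NO in the tables.
example = summarize("MK/PG*RA\\X q")
```

A non-zero `unrecognized` count is a cheap signal that a table cell should be compared against the filed sequence listing before it is used.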



TABLE 9

SEQ ID NO: | Accession Number | Species | Description | Smith-Waterman Score | % Identity
3937 | Y27700 | Homo sapiens | Human secreted protein encoded by gene No. 12. | |
3938 | AF093097 | Homo sapiens | putative RNA-binding protein Q99 | | 84
3939 | AB012308 | Anthocidaris crassispina | B2HC | 4169 | 74
3940 | U10248 | Homo sapiens | ribosomal protein L29 | 787 | 95
3941 | Y99418 | Homo sapiens | Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277. | 4031 | 100
3942 | AL023516 | Gallus gallus | B locus C type Lectin | 198 | 35


TABLE 10

SEQ ID NO: | Accession No. | Description | Results*
3937 | PR00049 | WILMS TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 9.168e-11 209-224
3942 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 6.400e-11 37-55

* Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence
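
The field order given in the footnote to Table 10 can be illustrated with a short parser. This is a sketch under stated assumptions rather than anything described in the application: it assumes the four fields are separated by whitespace, that the position is a `start-end` range, and that the exponent in the printed p-value reads `e-11`. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SignatureHit:
    subtype: str      # accession number subtype, e.g. "BL00615A"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature in the amino acid sequence


def parse_result(text):
    """Split one Results cell into its four fields, in the footnote's order."""
    subtype, raw_score, p_value, position = text.split()
    start, end = position.split("-")
    return SignatureHit(subtype, float(raw_score), float(p_value), int(start), int(end))


# The Results cell for SEQ ID NO: 3942 as read from Table 10.
hit = parse_result("BL00615A 16.68 6.400e-11 37-55")
```

Here `hit.start` and `hit.end` give the residue range (37 to 55) where the C-type lectin signature was located.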
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TABLE 11 



SEQ ID NO: | PFAM Name | Description | P-Value | PFAM Score
3938 | Piwi | Piwi domain | 2.6e-150 | 512.7
3940 | Ribosomal L29e | Ribosomal L29e protein family | 2.3e-19 | 77.8
3941 | Sema | Sema domain | 4e-181 | 615.1
3942 | lectin c | Lectin C-type domain | 0.086 | -7.1



TABLE 12

SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score)
3941 | 31 | 0.985 | 0.926
3942 | 21 | 0.974 | 0.894
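
The "position of end of signal" column can be read as the point at which a predicted signal peptide is separated from the remainder of the polypeptide. The sketch below is illustrative only and makes two assumptions not stated in the table: that the position is 1-based and names the last residue of the signal peptide, and that the mature portion starts at the next residue. The function name and the ten-residue sequence are hypothetical.

```python
def split_signal_peptide(sequence, signal_end):
    """Return (signal_peptide, mature_portion).

    signal_end is assumed to be the 1-based position of the last residue
    of the predicted signal peptide.
    """
    if not 0 < signal_end < len(sequence):
        raise ValueError("signal_end must fall inside the sequence")
    return sequence[:signal_end], sequence[signal_end:]


# Hypothetical ten-residue sequence with a signal predicted to end at position 4.
signal, mature = split_signal_peptide("MKLLVAGSTQ", 4)
```

Under these assumptions, the value 31 listed for SEQ ID NO: 3941 would place the first residue of the mature portion at position 32.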


TABLE 13

SEQ ID NO: of full length nucleotide sequence | SEQ ID NO: of full length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority Docket number corresponding SEQ ID NO: in priority application | SEQ ID NO: in USSN 09/496,914
3937 | 3943 | 3949 | 3955 | 787CIP2G 1 | 787 3587
3938 | 3944 | 3950 | 3956 | 787CIP2G 2 | 787 3813
3939 | 3945 | 3951 | 3957 | 787CIP2G 3 | 787 4462
3940 | 3946 | 3952 | 3958 | 787CIP2G 4 | 787 4887
3941 | 3947 | 3953 | 3959 | 787CIP2G 5 | 787 5794
3942 | 3948 | 3954 | 3960 | 787CIP2G 6 | 787 8743



TABLE 14

TISSUE ORIGIN | LIBRARY/RNA SOURCE | HYSEQ LIBRARY NAME | SEQ ID NOS:
adult brain | GIBCO | ABD003 | 3940
adult brain | Clontech | ABR006 | 3940
adult brain | Invitrogen | ABR014 | 3940
cultured preadipocytes | Stratagene | ADP001 | 3937
adult heart | GIBCO | AHR001 | 3940
adult kidney | GIBCO | AKD001 | 3940
adult lung | GIBCO | ALG001 | 3940
young liver | GIBCO | ALV001 | 3940
adult ovary | Invitrogen | AOV001 | 3938, 3940-3941
adult spleen | GIBCO | ASP001 | 3940-3941
testis | GIBCO | ATS001 | 3940
bone marrow | Clontech | BMD001 | 3938, 3940
bone marrow | Clontech | BMD004 | 3940
adult cervix | BioChain | CVX001 | 3940
endothelial cells | Stratagene | EDT001 | 3940
fetal brain | Clontech | FBR006 | 3940
fetal brain | Invitrogen | FBT002 | 3940-3941
fetal heart | Invitrogen | FHR001 | 3940
fetal kidney | Clontech | FKD001 | 3940
fetal kidney | Clontech | FKD002 | 3940
fetal liver-spleen | Columbia University | FLS001 | 3937, 3940
fetal liver-spleen | Columbia University | FLS002 | 3938, 3941
fetal liver-spleen | Columbia University | FLS003 | 3940
fetal liver | Clontech | FLV004 | 3940
fetal skin | Invitrogen | FSK001 | 3940-3942
fetal spleen | BioChain | FSP001 | 3940
fetal brain | GIBCO | HFB001 | 3937, 3940-3941
infant brain | Columbia University | IB2002 | 3937, 3939, 3941
leukocyte | GIBCO | LUC001 | 3940-3941
leukocyte | Clontech | LUC003 | 3940-3941
melanoma from cell line ATCC #CRL 1424 | Clontech | MEL004 | 3940
mammary gland | Invitrogen | MMG001 | 3937, 3940-3941
neuronal cells | Stratagene | NTU001 | 3937, 3942
prostate | Clontech | PRT001 | 3938
rectum | Invitrogen | REC001 | 3940
salivary gland | Clontech | SALs03 | 3941
small intestine | Clontech | SIN001 | 3940
skeletal muscle | Clontech | SKM001 | 3940
spinal cord | Clontech | SPC001 | 3940
thymus | Clontech | THMc02 | 3938
thyroid gland | Clontech | THR001 | 3942
uterus | Clontech | UTR001 | 3940
WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 



12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of 
claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 



19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature 
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 

Pages 485 to 6221 of this application contain amino acid sequence listings. 
They can be obtained at the address given below. 
